0.6.6: Very slow copying files with large attributes

 
Morph (The Knights of Syllable)
Posted: Thu Sep 04, 2008 10:19 am

Create a file (on AFS) with a large attribute, e.g.:
touch myfile
dd if=/dev/zero of=some-data bs=1k count=512
addattrib myfile largeattrib -f some-data
[i](note: addattrib -f only works in 0.6.6 dev builds)[/i]

Then copy it:
cp myfile myfile-copy

The copy takes a long time (minutes) and uses 100% CPU.
(I suspect this is one symptom of some deeper problems with AFS's current attribute code.)

doneill
Posted: Fri Oct 16, 2009 8:44 pm

Is this sorted? It's something I don't mind looking into if not.

Kaj (The Knights of Syllable)
Posted: Fri Oct 16, 2009 9:02 pm

As far as I know it's not fixed. It would be great if you could find out anything about it.

doneill
Posted: Fri Oct 16, 2009 10:26 pm

Further research done.

Basically, big attributes don't work, period.

Anything over a small attribute in size doesn't work properly. In fact, it locks up nice and solid; so far my system hasn't recovered on its own, and I've had to reboot after each test.

dd if=/dev/random of=largedata bs=1k count=2
echo 'Hello, world.' > testfile
addattrib testfile largedata -f largedata
lsattribs testfile

From what I can make out, a 'big' attribute is one which contains references to the inodes (or vnodes, I'm not sure which is which) where the attribute data is stored.

A 'small' attribute (typically referred to in the code as 'sd'), on the other hand, fits in the actual attribute inode itself.
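
Roughly, I picture the layout something like this. It is only a simplified sketch of the inline-value vs. block-reference idea, with made-up names and sizes, not the actual AFS on-disk structures:

[code]
#include <stdint.h>

/* Simplified sketch of inline ("small") vs. block-based ("big")
 * attribute storage.  All names and sizes here are illustrative,
 * not the real AFS definitions. */

#define SD_MAX 120                 /* hypothetical inline ("sd") capacity */

struct block_run {                 /* a contiguous run of disk blocks */
    uint32_t start_block;
    uint16_t block_count;
};

struct attr_inode {
    char     name[64];
    uint32_t type;
    uint64_t size;                 /* total size of the attribute value */
    union {
        /* small attribute: the value lives inside the attribute inode */
        uint8_t          small_data[SD_MAX];
        /* big attribute: the inode only references the blocks holding
         * the value, much like a regular file's data stream */
        struct block_run runs[SD_MAX / sizeof(struct block_run)];
    } u;
};
[/code]

If that picture is right, reading or copying a big attribute means walking those block runs, which is presumably where things currently go wrong.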

I do have a reasonable grasp of how a typical filesystem operates, but I'm having some difficulty following the code, so please bear with me while I riddle my local copy with printks and figure out what's happening in general :)

Vanders (The Knights of Syllable)
Posted: Sat Oct 17, 2009 4:58 am

You are correct: a "big" attribute is one which spans one or more blocks independently of the inode with which it is associated. Some other filesystems do something similar, e.g. ReiserFS makes a distinction between "small" and "large" xattrs.

[quote="doneill"]I'm having some difficulty following the code[/quote]

Yes, I'm afraid the code is rather... dense, which may explain why we haven't had much success in finding and fixing these sorts of bugs.

doneill
Posted: Tue Oct 26, 2010 4:20 am

I was reading over btrfs and found they have adopted a policy where extended file attributes are kept quite small. I spoke very briefly with one of the devs, who in not quite so many words stated that anything beyond a 1-inode-deep attribute would be a waste:
* tracking disk usage is difficult
* writing/reading is difficult
* ... well really, lots of points.

Ultimately this bug could be a non-issue if a fixed limit on attribute size were defined. I'm not suggesting that, but this bug certainly needs a solution, or at least a temporary workaround, since the result otherwise is a nasty crash.
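
Just to illustrate what a fixed limit could look like at the write path. The limit, the function name and the calling convention below are all made up for illustration; this is not the real AFS code:

[code]
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical guard for the attribute write path: reject writes that
 * would grow an attribute past a fixed limit.  Everything here is
 * illustrative; it is not the actual AFS entry point. */
#define AFS_ATTR_SIZE_MAX (64 * 1024)   /* example limit: 64 KB */

static int afs_check_attr_write(off_t pos, size_t len)
{
    if (pos < 0 || len > AFS_ATTR_SIZE_MAX ||
        (size_t)pos > AFS_ATTR_SIZE_MAX - len)
        return -EFBIG;   /* would exceed the fixed limit */
    return 0;            /* fine, hand off to the real write routine */
}

int main(void)
{
    /* a 512 KB attribute write (as in the original report) is rejected,
     * a 1 KB write goes through */
    printf("512 KB write: %s\n",
           afs_check_attr_write(0, 512 * 1024) ? "rejected" : "allowed");
    printf("1 KB write:   %s\n",
           afs_check_attr_write(0, 1024) ? "rejected" : "allowed");
    return 0;
}
[/code]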

Kaj (The Knights of Syllable)
Posted: Tue Oct 26, 2010 5:21 am

Hmm, I don't really see why an extended attribute couldn't be managed very much like the main file data. Why would it be more difficult to manage the data of an attribute than the data of the main file?

doneill
Posted: Tue Oct 26, 2010 6:19 am

For one example in particular, "ls -l" in a terminal lists files with sizes, but should this reflect the size of the file's contents only, or that size plus the size of all attributes associated with the file? Either way, switches would need to be added to the 'ls' utility to specifically list one value, the other, or both.

I'm not sure if there's a way, when copying files, to retain the file attributes from the source on an attribute-compatible destination volume, but this is another example where such attribute provisions would have to be considered.

Does setting or deleting an attribute update the file's mtime? Should reading an attribute be reflected in the file's atime? Really, none of these operations even touch the file itself at the filesystem level, but the average user may not realize this.

Certainly, limiting the size of attributes doesn't mitigate these issues, but I believe it makes them less consequential in the end, for a more philosophical reason:

It could be assumed that data stored in extended attributes may be essential to the use of the file by a program; for example, a "Content-Handler" attribute with a value of "php4" to instruct a web server to use PHP to interpret the file. With that in place, a file extension might not be specified at all (it would not be necessary), and archiving the file in a tar file, or moving it over a networked filesystem or SCM (FTP/rsync/CVS/etc.), would likely result in the loss of such attributes.

One of my hobby projects (a web server) actually allows specifying the content type and handler of files in a document root, and even keeps a hit counter on each resource as files are requested, so I have a stake in this. ;)
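
Roughly, the lookup on the server side could look like the sketch below. The read_content_handler() helper and the path are stand-ins; on Syllable it would call the native attribute API, which I'm leaving out here since I don't have the exact calls in front of me:

[code]
#include <stdio.h>
#include <string.h>

/* Stand-in for reading the "Content-Handler" attribute of a file.
 * On Syllable this would use the native attribute API; here it just
 * returns a canned value so the sketch is self-contained. */
static int read_content_handler(const char *path, char *buf, size_t len)
{
    (void)path;                     /* real code would open the file here */
    strncpy(buf, "php4", len - 1);  /* pretend the attribute is "php4" */
    buf[len - 1] = '\0';
    return 0;
}

int main(void)
{
    char handler[64];

    /* dispatch on the attribute rather than on the file extension */
    if (read_content_handler("/var/www/index", handler, sizeof(handler)) == 0)
        printf("dispatching /var/www/index via handler '%s'\n", handler);
    else
        printf("no Content-Handler attribute; fall back to the extension\n");

    return 0;
}
[/code]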

Lastly, I expect attributes spanning multiple inodes may be more difficult to journal, consolidate, or fsck properly. I'm merely assuming this, however.

My overall opinion is simply that *if* the use of file attributes is to be encouraged, then mechanisms and provisions for them must be addressed universally (at least within the scope of Syllable Desktop and Server).

Kaj (The Knights of Syllable)
Posted: Tue Oct 26, 2010 8:06 am

Those are all considerations for extended attributes in general, not for the specific implementation of large attributes. Syllable has followed the choice of the Apple Mac, BeOS, AtheOS and many Unix filesystems to support extended attributes from the beginning. So all these considerations are not matters of whether to choose to support extended attributes or not, but matters of how best to implement and use them.

For example, the Syllable version of the cp command (now in the GNU CoreUtils package) has carried a patch to copy extended attributes since very early Syllable versions. Much of the other support is in native Syllable subsystems and applications, so the lower-level system matters less there.

Morph (The Knights of Syllable)
Posted: Mon Nov 01, 2010 12:51 am

[quote="doneill"]btrfs... have adopted a policy where extended file attributes are kept quite small.[/quote]
You mean there's a fixed size limit on extended attributes? "640k should be enough for anyone"... :)

doneill
Posted: Mon Nov 01, 2010 9:34 am

I think limiting the size of file attributes is rather wise, actually. Imagine a 2 KB file with 2.5 million attributes, each ranging in size between 2 bytes and 3 GB...

Ridiculous? Yes. But without any defined limit, that situation is both theoretically possible AND something the implementation has to allow, and it would be expected to handle it reasonably efficiently, too.

It seems to me that this is asking quite a lot.

Morph (The Knights of Syllable)
Posted: Mon Nov 01, 2010 6:23 pm

Some of the uses of attributes that we've discussed or considered are:
* caching directory state data such as icon positions, selections, navigation history
* storing preview clips for video files
* caching the directory listing and file offsets of a tar file
* storing diffs for persistent modifications to a read-only filesystem
as well as for 'regular' metadata like mime type. Some of these would potentially produce quite large attributes (several megabytes). To support these applications nicely, you'd need a large size limit, several hundred MB or so. But then we should ask if that's much better than just having no limit at all!

Of course, 2.5 million attributes is quite extreme - but so would be a regular directory with 2.5 million files. I doubt many filesystems would handle that very well.

NecroRomancist
Posted: Sun May 08, 2011 5:33 am

[quote="Morph"]Some of the uses of attributes that we've discussed or considered are... Some of these would potentially produce quite large attributes (several megabytes). To support these applications nicely, you'd need a large size limit, several hundred MB or so.[/quote]

Not sure I agree with all of that. I see it like storing BLOBs in a database: I, for one, only keep references to files on the filesystem unless there is a really good reason to put the data in the database. I'd keep references for the info you mention, but not the real data.