Short version
The question & answers relate to the following topics:
- File as a collection of records vs. file as a stream of bytes
- Sequential File
- Distinction between file data and metadata
- Transfer of metadata along with the file
- File size measurement
Original question
According to William Stallings' Operating Systems, the most common structure for a file is the sequential file, which consists of a collection of records. All files have the same number of records, and each record has a predetermined length. This seems very inefficient and doesn't match my previous understanding of files at all.
For example: if I have a JPG file, it would have records indicating things like resolution, creation date, etc. Some record fields would not be relevant for a JPG file and would be empty, but space would be reserved for those fields regardless. Since the actual data of the JPG file would be too large for any record, there would be an "overflow pointer" to where the data actually is.
Stallings also presents an alternative to file-as-a-collection-of-records: file as a stream of bytes. While the book claims it's rare, it is more in line with my understanding of files: a file is just raw data with no universal structure. However, if I move a file from one computer to another, things like "creation date" move along with it, which suggests that files have some attributes that are saved within the file (as opposed to in the file management system). If all files have some header information like creation date, and that information is saved within the file, why does my Ubuntu claim that an empty text file takes up 0 bytes of space?
I'm guessing that a typical file is simply a stream of bytes with no predetermined structure, and that header information like the creation date is somehow transferred along with the file when it is moved. I'm also guessing that the chapter about sequential files in the book is some kind of remnant from the 1980s. Please correct any misconceptions that I have.
Asked By : Atte Juvonen
Answered By : Yuval Filmus
I've never heard of sequential files, but apparently the term can mean two things:
- Data which can only be accessed sequentially, for example the contents of a backup tape.
- Record-based files (what Stallings refers to). Apparently these are still used in databases (see for example this COBOL page), but not on your desktop.
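To make the record-based idea concrete, here is a minimal sketch in Python of a file made of fixed-length records. The record layout (an integer id, a 16-byte name, a float score) and the helper names `write_records` and `read_record` are made up for illustration and are not taken from Stallings or from any particular database. Because every record has the same length, record number i starts at byte offset i times the record size.

```python
import io
import struct

# Hypothetical layout: 4-byte unsigned id, 16-byte name, 4-byte float.
# "<" means little-endian with no padding, so every record is 24 bytes.
RECORD = struct.Struct("<I16sf")

def write_records(f, rows):
    """Append each (id, name, score) row as one fixed-length record."""
    for rec_id, name, score in rows:
        # A name shorter than 16 bytes is padded with NUL bytes by struct.
        f.write(RECORD.pack(rec_id, name.encode("ascii"), score))

def read_record(f, index):
    """Jump straight to record `index` without reading the ones before it."""
    f.seek(index * RECORD.size)
    rec_id, raw_name, score = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii"), score

# An in-memory buffer stands in for a file on disk.
buf = io.BytesIO()
write_records(buf, [(1, "alice", 3.5), (2, "bob", 4.0), (3, "carol", 2.25)])

third = read_record(buf, 2)
total_size = len(buf.getvalue())
print(third, total_size)
```

Note that the short names still occupy all 16 bytes of their field, which is the "space reserved regardless" cost raised in the question.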
File names and other metadata, such as the creation date, are stored as part of the filesystem rather than inside the file. An empty file doesn't really take up zero bytes, since its metadata has to be stored somewhere. When you move data from filesystem to filesystem, an attempt will be made to copy the metadata, but how successful this attempt is depends on the filesystems involved. For example, when moving a file from one server to another, the user and group attributes may be lost, since a given user or group can exist on one server but not on the other.
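As a rough illustration, the following Python sketch creates an empty file, reads its size and timestamp from the filesystem, and then copies it twice. It uses the modification time rather than the creation date, since that is the timestamp the standard library can set and copy portably. The file names and the timestamp value are arbitrary choices for the example. `shutil.copy` copies the contents and permission bits, while `shutil.copy2` additionally attempts to preserve metadata such as timestamps; how much survives still depends on the platform and filesystem.

```python
import os
import shutil
import tempfile

OLD_TIME = 1_000_000_000  # arbitrary timestamp (September 2001), for illustration

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "empty.txt")
    open(src, "w").close()  # create a file with no content

    # The reported size counts only the data bytes, not the metadata.
    empty_size = os.stat(src).st_size

    # The modification time lives in the filesystem, so it can be changed
    # without touching the (nonexistent) file contents.
    os.utime(src, (OLD_TIME, OLD_TIME))

    plain = shutil.copy(src, os.path.join(d, "plain.txt"))  # data + mode
    kept = shutil.copy2(src, os.path.join(d, "kept.txt"))   # also tries metadata

    plain_mtime = os.stat(plain).st_mtime
    kept_mtime = os.stat(kept).st_mtime

print(empty_size, int(kept_mtime) == OLD_TIME, plain_mtime > OLD_TIME)
```

The empty file reports a size of 0 yet still has a timestamp, and only the copy made with `shutil.copy2` is expected to carry the old timestamp along.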
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/55893