Files and the file system

The unit of data storage is a variable-length object called a file.

A file is a named sequence of bytes of arbitrary length. Since a file can have zero length, creating a file amounts to giving it a name and registering it in the file system, which is one of the functions of the OS.

Typically, data of the same type is stored in a separate file. In this case, the data type determines the file type.

Since there is no size limit in the file definition, one can imagine a file having 0 bytes (empty file), and a file having any number of bytes.

When defining a file, special attention is paid to the name. It actually carries address data, without which the data stored in the file will not become information due to the lack of a method to access it. In addition to addressing-related functions, a file name can also store information about the type of data contained in it. This is important for automatic tools for working with data, because based on the file name (or rather, its extension), they can automatically determine an adequate method for extracting information from the file.

File structure - the hierarchical structure in which the operating system presents files and directories (folders).

The top of the structure is the name of the storage medium on which the files are saved. Below it, files are grouped into directories (folders), within which nested directories can be created.

Names of external storage media. The disks on which information is stored on the computer have their own names: each disk is named with a letter of the Latin alphabet followed by a colon. Floppy disk drives are always assigned the letters A: and B:. The logical drives of the hard drive are named starting with the letter C:. The names of the logical drives are followed by the names of the CD drives. For example, suppose a computer has a floppy disk drive, a hard drive divided into 3 logical drives, and a CD drive. The letters of the storage media are then: A: - floppy disk drive; C:, D:, E: - logical drives of the hard drive; F: - CD drive.
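
The naming convention above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text; the function name `assign_drive_letters` and the descriptive labels are hypothetical.

```python
from string import ascii_uppercase


def assign_drive_letters(floppies, logical_drives, cd_drives):
    """Assign letters following the convention described in the text."""
    if floppies > 2:
        raise ValueError("only A: and B: are reserved for floppy drives")
    names = {}
    # A: and B: are reserved for floppy drives even when fewer are installed.
    for i in range(floppies):
        names[ascii_uppercase[i] + ":"] = "floppy disk drive"
    # Logical drives of the hard drive start at C:, followed by CD drives.
    letters = iter(ascii_uppercase[2:])
    for _ in range(logical_drives):
        names[next(letters) + ":"] = "logical drive"
    for _ in range(cd_drives):
        names[next(letters) + ":"] = "CD drive"
    return names


# The configuration from the example: 1 floppy drive, 3 logical drives, 1 CD drive.
print(assign_drive_letters(1, 3, 1))
```

For the configuration in the example, the CD drive receives the letter F:, because B: stays reserved for a second floppy drive.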

Logical drive, or volume (English: volume or partition) - a part of the computer's long-term memory that is treated as a single whole for ease of use. The term "logical disk" is used in contrast to "physical disk", which refers to the memory of one specific disk medium.

For the operating system, it does not matter where the data is located - on a laser disk, in a hard drive partition, or on a flash drive. To unify the represented areas of long-term memory, the concept of a logical disk is introduced.

In addition to the stored information, the volume contains a description of the file system. As a rule, this is a table listing all files and their attributes (File Allocation Table, FAT). The table determines, in particular, in which directory (folder) a particular file is located. Thanks to this, when a file is moved from one folder to another within the same volume, the data is not transferred from one part of the physical disk to another; only the entry in the file allocation table changes. If a file is transferred from one logical drive to another (even if both logical drives are located on the same physical drive), the data is necessarily physically transferred (copied, with the original deleted afterwards if the copy succeeds).
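
The difference between the two kinds of move can be illustrated with a toy model. The sketch below is not how FAT is actually laid out on disk; the `Volume` class, its fields and the `transfer` function are hypothetical names chosen only to show that a move inside one volume touches the table alone, while a move between volumes copies the data.

```python
class Volume:
    """Toy model of a volume: a file table plus the stored data blocks."""

    def __init__(self):
        self.table = {}    # path -> block id (stands in for the file table)
        self.blocks = {}   # block id -> bytes (stands in for the data area)
        self._next_id = 0

    def write(self, path, data):
        self.blocks[self._next_id] = data
        self.table[path] = self._next_id
        self._next_id += 1

    def read(self, path):
        return self.blocks[self.table[path]]

    def move(self, old_path, new_path):
        # Same volume: only the table entry changes, the block is untouched.
        self.table[new_path] = self.table.pop(old_path)

    def delete(self, path):
        del self.blocks[self.table.pop(path)]


def transfer(src, src_path, dst, dst_path):
    # Different volumes: copy the data first, then delete the original.
    dst.write(dst_path, src.read(src_path))
    src.delete(src_path)


c, d = Volume(), Volume()
c.write(r"\Docs\a.txt", b"hello")
block_before = c.table[r"\Docs\a.txt"]
c.move(r"\Docs\a.txt", r"\Archive\a.txt")
print(c.table[r"\Archive\a.txt"] == block_before)  # same block, new table entry
transfer(c, r"\Archive\a.txt", d, r"\a.txt")
print(d.read(r"\a.txt"), c.table)
```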

For the same reason, formatting and defragmenting each logical drive does not affect the others.

Directory (folder) - an area of disk space (a special system file) that stores service information about files (name, extension, creation date, size, etc.). Directories at lower levels are nested within directories at higher levels and are subdirectories of them. The higher-level directory (superdirectory) in relation to lower-level directories is called the parent directory. The top level of nesting of the hierarchical structure is the root directory of the disk (Fig. 1). The directory that the user is currently working with is called the current directory.

The rules for naming a directory are no different from the rules for naming a file, although it is not customary to specify name extensions for directories. When writing a file access path through a system of subdirectories, all intermediate directories are separated by a specific symbol. Many operating systems use "\" (backslash) as this character.

The requirement for a unique file name is obvious - without this it is impossible to guarantee unambiguous access to data. In computer technology, the requirement of name uniqueness is ensured automatically - neither the user nor the automation can create a file with a name identical to an existing one.

When a file is used that is not in the current directory, the program accessing the file needs to indicate where exactly the file is located. This is done by specifying the path to the file.

The path to the file is the name of the medium (disk) followed by a sequence of directory names, separated by the “\” character in Windows (UNIX-like operating systems use the “/” character). The path specifies the route to the directory in which the desired file is located.

There are two different methods of specifying the file path. In the first, each file is given an absolute path name (full file name), consisting of the names of all directories from the root to the one that contains the file, followed by the name of the file itself. For example, the path C:\Abby\Doc\otchet.doc means that the root directory of disk C: contains a directory Abby, which in turn contains a subdirectory Doc, where the file otchet.doc is located. Absolute path names always begin with the media name and root directory and are unique. The second method is the relative path name. It is used together with the concept of the current directory. The user can designate one of the directories as the current working directory. In this case, all path names that do not begin with a delimiter character are considered relative and are counted from the current directory. For example, if the current directory is C:\Abby, then the file with the absolute path C:\Abby\Doc\otchet.doc can be referred to as Doc\otchet.doc.
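
As a small illustration, Python's standard `pathlib.PureWindowsPath` can be used to check the relationship between the absolute and the relative name from the example. The variable names are chosen for this sketch only, and no real disk is accessed.

```python
from pathlib import PureWindowsPath

current_dir = PureWindowsPath(r"C:\Abby")
absolute = PureWindowsPath(r"C:\Abby\Doc\otchet.doc")
relative = PureWindowsPath(r"Doc\otchet.doc")

print(absolute.is_absolute())              # True
print(relative.is_absolute())              # False
# Joining the current directory and the relative name gives the absolute name.
print(current_dir / relative == absolute)  # True
# Going the other way: the absolute name expressed relative to C:\Abby.
print(absolute.relative_to(current_dir))   # Doc\otchet.doc
```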

Because the file structure of a computer can be very large, searching for the necessary documents simply by navigating the file structure is not always convenient. It is usually assumed that every computer user should know (and remember) the structure of the folders in which they store documents. However, there are times when documents are saved outside of this structure. For example, many applications save documents to default folders if the user has forgotten to specify explicitly where the document should be saved. The default folder can be the folder in which a document was last saved, the folder in which the application itself is located, or some kind of service folder, for example \My Documents. In such cases, document files may be “lost” in the mass of other data.

The need to search for files especially often arises during setup work. A typical case is when, in search of the source of uncontrolled changes in the operating system, you need to find all the files that have been changed recently. Automatic file search tools are also widely used by specialists who set up computer systems - it is difficult for them to navigate the file structure of “someone else’s” personal computer, and searching for the necessary files by navigation is not always productive for them.

The primary search tool in Windows XP is launched from the Main Menu with the command Start > Find > Files and Folders. Another way to launch it is no less convenient: from any folder window (View > Explorer Bars > Search > Files and Folders, or the F3 key).

The controls on the search panel let you narrow the search area based on the available information about the file name and address. The wildcard characters «*» and «?» are allowed when entering a file name. The «*» character replaces any number of arbitrary characters, and the «?» character replaces any single character. For example, searching for a file named *.txt displays all files with the .txt name extension, and searching for files named *.??t produces a list of all files with the name extensions .txt, .bat, .dat and so on.
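
The two wildcard examples can be tried out with Python's standard `fnmatch` module, whose «*» and «?» follow the same basic rules. This is only an approximation of the search panel, not its actual implementation; the helper `match_mask` and the sample file list are hypothetical.

```python
from fnmatch import fnmatchcase


def match_mask(names, mask):
    """Return the names that match a wildcard mask, ignoring letter case."""
    return [n for n in names if fnmatchcase(n.lower(), mask.lower())]


files = ["notes.txt", "run.bat", "table.dat", "report.doc", "README"]
print(match_mask(files, "*.txt"))   # ['notes.txt']
print(match_mask(files, "*.??t"))   # ['notes.txt', 'run.bat', 'table.dat']
```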

When searching for files with “long” names, you should keep in mind that if the “long” name contains spaces (and this is acceptable), then when creating a search task, such a name should be enclosed in quotes, for example: "Current work.doc".

The search bar has additional hidden controls. They appear when you click on the downward expanding arrow.

· The question "When were the last changes made?" lets you limit the search by the date the file was created, last modified, or opened.

· The question "What is the file size?" lets you limit the search to files of a certain size.

· The item "Extra options" lets you specify the file type, allow hidden files and folders to be searched, and set some other search options.

When an unformatted text document is being sought, it is possible to search not only by file attributes but also by content. The desired text can be entered in the field "A word or phrase in a file".

Searching for a document based on a text fragment does not produce results if it is a document that has formatting, because the formatting codes violate the natural sequence of text character codes. In these cases, you can sometimes use the search tool that comes with the application that formats the documents.

19. Data compression and file archiving.

A characteristic feature of most “classical” data types that people traditionally work with is a certain redundancy. The degree of redundancy depends on the type of data. In addition, it depends on the coding system adopted. For example, encoding text information in Russian (using the Russian alphabet) gives on average 20-30% more redundancy than encoding the same information in English.
Redundancy also plays an important role in information processing. However, when it comes not to processing, but to storing finished documents or transmitting them, redundancy can be reduced, which gives the effect of data compression.
If information compression methods are applied to finished documents, then the term data compression is often replaced by the term data archiving, and the software tools that perform these operations are called archivers.
Depending on the object in which the data being compressed is located, a distinction is made between:
- compression (archiving) of files;
- compression (archiving) of folders;
- compression of disks.
If the content of the data changes during compression, the compression method is irreversible: when the data is restored from the compressed file, the original sequence is not completely recovered. Such methods are also called controlled-loss (lossy) compression methods. They are applicable only to those types of data for which the formal loss of part of the content does not lead to a significant decrease in consumer properties. First of all, this applies to multimedia data: video sequences, music recordings, sound recordings and drawings. Lossy compression methods typically provide much higher compression ratios than reversible methods, but they cannot be applied to text documents, databases, or, still less, program code. Typical lossy compression formats are:
- .JPG for graphic data;
- .MPG for video data;
- .MP3 for audio data.
If data compression only changes its structure, then the compression method is reversible. From the resulting code, you can restore the original array by applying the reverse method. Reversible methods are used to compress any type of data. Typical lossless compression formats are:
- .GIF, .TIF, .PCX and many others for graphics data;
- .AVI for video data;
- .ZIP, .ARJ, .RAR, .LZH, .LH, .CAB and many others for any data type.
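
Reversibility is easy to check in practice. The sketch below uses Python's standard `zlib` module (the DEFLATE method, which is also the usual method inside .ZIP archives) on an artificially redundant byte string; the sample data is made up for the illustration.

```python
import zlib

# Highly redundant sample data: the same phrase repeated many times.
original = b"redundant data " * 200

packed = zlib.compress(original, 9)    # 9 = strongest compression level
restored = zlib.decompress(packed)

print(len(original), len(packed))      # the packed form is much shorter
print(restored == original)            # True: nothing was lost
```

The restored array is byte-for-byte identical to the original, which is exactly what distinguishes a reversible method from a lossy one.
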
The “classical” data compression formats, widely used in everyday computer work, are the .ZIP and .ARJ formats. Recently, the popular .RAR format has been added to them.
The basic functions that most modern archive managers perform include:
- extracting files from archives;
- creation of new archives;
- adding files to an existing archive;
- creation of self-extracting archives;
- creation of distributed archives on low-capacity media;
- testing the integrity of the archive structure;
- full or partial restoration of damaged archives;
- protection of archives from viewing and unauthorized modification.
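
Several of these basic functions can be tried with Python's standard `zipfile` module. The sketch below is only an illustration of creating an archive, adding a file to it, testing its integrity and extracting a file; it keeps the archive in memory, and the file names and contents are made up.

```python
import io
import zipfile

buffer = io.BytesIO()  # in-memory stand-in for an archive file on disk

# Create a new archive and put one file into it.
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("report.txt", "first document")

# Add another file to the existing archive.
with zipfile.ZipFile(buffer, "a", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("notes.txt", "second document")

# Test the integrity, list the contents and extract one file.
with zipfile.ZipFile(buffer, "r") as archive:
    first_bad = archive.testzip()   # None when no damaged member is found
    names = archive.namelist()
    text = archive.read("notes.txt").decode("utf-8")

print(first_bad, names, text)
```
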
Self-extracting archives. A self-extracting archive is prepared from a regular archive by attaching a small program module to it. The archive itself receives the .EXE name extension, which is typical of executable files.
Distributed archives. Some managers (for example WinZip) perform splitting directly onto floppy disks, and some (for example WinRAR and WinArj) allow you to pre-split the archive into fragments of a given size on the hard drive. Subsequently, they can be transferred to external media by copying.
When creating distributed archives, the WinZip manager has an unpleasant feature: each volume carries files with the same names. As a result, it is not possible to determine the volume numbers stored on each floppy disk by file name. WinArj and WinRAR archive managers label all distributed archive files with different names and therefore do not create such problems.
Archive protection. In most cases, archives are protected using a password, which is requested when you try to view, unpack or change the archive.
Additional functions of archive managers include service functions that make work more convenient. They are often implemented by externally connecting additional utilities and provide:
- viewing files of various formats without extracting them from the archive;
- searching for files and data inside archives;
- installing programs from archives without unpacking them first;
- checking the archive for computer viruses before unpacking it;
- cryptographic protection of archived information;
- decoding email messages;
- “transparent” compression of .EXE and .DLL executable files;
- creation of self-extracting multi-volume archives;
- selecting or adjusting the compression ratio.

File - a named set of data stored on a computer storage medium. The concept of a file applies primarily to data stored on disks, and therefore files are usually identified with areas of disk storage on these media.

The file system includes the rules for forming file names and the ways of accessing files, a system of file tables of contents, and a structure for storing files on disks.

A file has a name and attributes (archive, read-only, hidden, system) and is characterized by its size in bytes and by the date and time of its creation or last modification.

The file name consists of two parts: the name proper and the extension (type). The type may be absent. The name is separated from the type by a dot. In Windows, file names can be up to 255 characters long. The type indicates the kind and purpose of the file; some types are standard, for example:

· .COM and .EXE - executable files;

· .BAT - command batch file;

· .TXT - text file of any type;

· .MDB - Access database file;

· .XLS - Excel spreadsheet;

· .DOC - text file of the Microsoft Word editor;

· .ZIP - packed Winzip/PkZip archiver file.

The use of standard extensions makes it possible to omit them when running system programs and application packages; in that case the default is applied.

Directory (folder, catalog) - a named set of files grouped because they belong to the same software product or for other reasons. The expression “the file is included in the directory” or “the file is contained in the directory” means that information about this file is recorded in the area of the disk related to that directory. Directory names follow the same rules as file names. Directories usually do not have an extension, although one can be assigned.

On each physical or logical disk there is a root (main) directory, which cannot be created, deleted, or renamed by the user. It is denoted by the character '\' (on some operating systems '/' can also be used). Other directories and files can be registered in the root directory. Subdirectories can, in turn, contain lower-level directories. This structure is called a hierarchical system, or directory tree, in which the root directory forms the root of the tree and the remaining directories are its branches.

Grouping files into directories does not mean that they are grouped in any way in one place on the disk. Moreover, the same file can be “scattered” (fragmented) across the entire disk. Files with the same names can be located in several directories on the disk, but several files of the same name cannot be located in the same directory.

In order for the OS to access the file, you must specify:

· path along the directory tree;

· full file name.

This information is given in the file specification, which has the following format:

[drive:][path]filename[.type]

Square brackets indicate that the corresponding part of the specification can be omitted. In this case the default value is used.

If no drive is specified, the current drive is used. The current drive is the drive that the operating system is currently working with.
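
A specification of this form can be taken apart with Python's standard `pathlib.PureWindowsPath`. The function `split_specification` and its dictionary keys are hypothetical names for this sketch; an empty drive in the result corresponds to "use the current drive".

```python
from pathlib import PureWindowsPath


def split_specification(spec):
    """Split '[drive:][path]filename[.type]' into its four parts."""
    p = PureWindowsPath(spec)
    return {
        "drive": p.drive,                      # '' means: use the current drive
        "path": str(p.parent)[len(p.drive):],  # '.' means: current directory
        "name": p.stem,
        "type": p.suffix.lstrip("."),          # '' when the type is omitted
    }


print(split_specification(r"C:\Abby\Doc\otchet.doc"))
print(split_specification("otchet.doc"))
```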

Path - the sequence of folders that must be traversed to reach the desired file. Names in the path are written in order from the highest level to the lowest and are separated by the "\" character. The directory that contains the current directory is called the parent directory.

Quite often there is a need to process several files at once with one command, for example, to delete all backup files with the BAK extension, or to copy several document files with the names doc1.txt, doc2.txt, etc. In these cases, special characters called masks are used, which make it possible to describe a group of files with one name. There are only two mask characters:

· the * symbol in the file name or extension replaces any allowed number of characters;

· the ? symbol replaces any single character, or the absence of a character, in a file name or extension.

The examples above correspond to the masks *.bak (all files with the bak extension) and doc?.txt (all files with the txt extension and a 4-character name starting with doc).

Questions on the topic submitted for testing:

1. Definition of an OS. Basic Windows concepts (multitasking, graphical user interface, data embedding and linking).

2. Graphical user interface, its main components (windows, dialog tools, standard management of windows and dialog tools).

3. Working with the keyboard and mouse in Windows. Standard key combinations and mouse operations.

4. Working with files and folders in Windows - basic operations and capabilities. “My Computer” and “Explorer” programs.

5. Searching for information in Windows.

6. Create shortcuts to applications and documents.

7. Control panel and its main components.

8. Handling failures in Windows.

9. Setting up DOS applications for Windows.

INTRODUCTION

Currently, the most common personal computers (PCs) are based on the Pentium processor. Most of these PCs run the operating system (OS) Windows 95 or Windows 98 (Windows 9x or simply Windows). Windows is the de facto standard for 32-bit personal computers. To date, several versions of the system have already been developed.

An operating system (OS) is a set of programs that provide control of computer hardware, planning the efficient use of its resources and solving problems based on user tasks. The OS is loaded into the computer when it is turned on.

Distinctive features of modern operating systems, including Windows 9x, are:

Developed user interface, that is, means and methods of interaction with the user;

Multitasking – the ability to ensure the execution of several programs “simultaneously”;

Using all the capabilities provided by modern microprocessors;

Work stability and security.

Windows 9x is the successor and result of the merger of two systems: Windows 3.1x and MS-DOS. The developers had to make a number of compromises to ensure its compatibility with these systems:

Windows 9x starts functioning in real mode, and only then goes into protected mode;

Windows 9x is based on an updated MS-DOS;

Windows 9x has a sufficient number of 16-bit components (modules and device drivers).

Windows 9x is based on an object-oriented approach. Objects include documents, applications, folders, files, shortcuts, drives, etc. Opening an object is one of the main concepts in the system. The actions performed depend on the type of object:

- opening a document consists of launching the appropriate application and loading the document into it so that it can be viewed, edited and printed. Instead of opening and loading a document, one can speak of opening and loading the file containing the document, since all documents are stored in files;

- opening an application means starting it;

- opening a folder consists of displaying its contents on the screen, which allows you to carry out any actions with the objects located in it;

- opening an input/output device takes you into the environment of the manager program that controls this device;

- opening a shortcut is in many cases equivalent to opening the object for which it was created.

When processing a document, you can use either a procedural or an object-oriented approach. In the first case, you need to know which application should process the document. In the second case, double-clicking a document or a shortcut created for it launches the application associated with it. If Windows does not know which application should process a given document, it offers to associate the document with a specific application.

FILE SYSTEM COMPONENTS

Work on a PC involves various types of data. Data refers to everything that is subject to storage (programs in source or machine code, data for their operation, any text documents and numerical data, encoded tabular, graphic and other information).

A file is a named collection of homogeneous information on an external medium (for example, on a magnetic disk).

In a file name (Windows 9x), almost all printable characters can be used, but there are a number of restrictions:

There cannot be spaces at the beginning or end of the file name (they can be specified, but they will be ignored);

The file name cannot begin or end with a dot;

The following characters cannot be used in the file name: /, \, :, *, ?, ", <, >, |, since they are reserved for other purposes;

The file name length should not exceed (in general) 255 characters.

Such names are called long. For example, Laboratory work No. 1 in the operating systems discipline.

For each file, Windows 9x automatically generates a short name that is formed according to the requirements of the MS-DOS operating system and is used to ensure compatibility between operating systems. It contains no more than 8 characters. In addition to the characters prohibited in long names, the following may not be used: ;, +, [, ], =, the period, the comma and the space. The short name begins like the long name, followed by the ~ symbol and a serial number (no more than 8 characters in total). Prohibited characters are omitted, and lowercase letters are converted to uppercase. For example, PRIMER~1 can correspond to a long file name beginning with the letters Primer. If there is another such file, its short name will be PRIMER~2.
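
The rule described above can be sketched as follows. This is a simplified illustration of the text's description, not the actual Windows algorithm: it handles only the name part (no extension), always appends the ~ and serial number, and the function name `short_name` and the `existing` set are hypothetical.

```python
# Characters that are omitted when the short name is built
# (those prohibited in long names plus the additional ones for short names).
FORBIDDEN = set('/\\:*?"<>|;+[]=., ')


def short_name(long_name, existing):
    """Build a short name of at most 8 characters that is not in `existing`."""
    cleaned = "".join(c for c in long_name.upper() if c not in FORBIDDEN)
    number = 1
    while True:
        suffix = "~" + str(number)
        # Keep as many leading characters as fit together with the suffix.
        candidate = cleaned[: 8 - len(suffix)] + suffix
        if candidate not in existing:
            return candidate
        number += 1


print(short_name("Primer of work", set()))       # PRIMER~1
print(short_name("Primer two", {"PRIMER~1"}))    # PRIMER~2
```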

Names reserved for I/O devices are prohibited: PRN (printer), CON (console, i.e. keyboard and monitor), NUL (dummy device), LPT1–LPT3 (first to third parallel port), COM1–COM3 (first to third serial port). The Latin letters A:, B:, C:, D:, etc. denote external storage devices.
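
The restrictions on long names listed above can be collected into a single check. The sketch below encodes only the rules named in the text (plus the wildcard character *, which Windows also rejects in file names); the function name `is_valid_long_name` is hypothetical, and real systems apply further rules not covered here.

```python
RESERVED = {"PRN", "CON", "NUL", "LPT1", "LPT2", "LPT3", "COM1", "COM2", "COM3"}
FORBIDDEN_CHARS = set('/\\:*?"<>|')


def is_valid_long_name(name):
    """Check a file name against the restrictions listed in the text."""
    if not name or len(name) > 255:
        return False
    if name != name.strip(" "):   # leading or trailing spaces would be ignored
        return False
    if name.startswith(".") or name.endswith("."):
        return False
    if any(c in FORBIDDEN_CHARS for c in name):
        return False
    return name.upper() not in RESERVED   # device names are prohibited


print(is_valid_long_name("Laboratory work No. 1"))  # True
print(is_valid_long_name("PRN"))                    # False
print(is_valid_long_name("what?.txt"))              # False
```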

If there is at least one period in the file name, the file is considered to have an extension, which reflects the nature of the stored information. The file name extension is the sequence of characters located after the last period in the name. The period is treated as the separator between the name and the extension. The extension is specified either by the user or by the program that generates the file. It is better to use standard extensions of 1-3 characters, since they make the file type clear, for example:

BAT for command files;

DOC for files containing various documents in the Microsoft Word editor format;

PAS for programs written in the PASCAL language;

PCX for files with illustrations in the format of the Publisher's Paintbrush raster graphics editor;

BAK for files with a previous version of the document (backup files);

EXE for files with a ready-to-execute program;

COM for files with a program ready to be executed only in the MS-DOS environment.

Currently, the term application is used for programs that are ready to run under the operating system, for example, a Windows application.

Example of a file name: COMMAND.COM, where COMMAND is the name and COM is the extension.

In addition to the long and short names, a number of properties are associated with each file. File properties include:

File attributes;

Date and time of its creation;

Date and time of file modification;

Date of last access to the file (read or write);

Length, or file size (in bytes).

File attributes determine how it can be used and access rights to it. In Windows 9x, attributes play an informational role rather than a protective one, as in the MS-DOS environment. A file can be assigned any combination of the following attributes:

Read-Only [R] (Read Only) - sets the file write protection, the file cannot be deleted, moved or modified without special measures;

Archive [A] (Archive) - sets the archive status for the file, is set automatically when creating or modifying the file, can be removed by archiving or backup tools;

Hidden [H] (Hidden) – hidden files, unless special measures are taken, are not shown in folders.

System [S] (System) – an attribute that is supplied to system files.
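
Because any combination of attributes can be assigned, they are naturally represented as independent bits. The sketch below uses the bit values of the FAT directory entry (0x01, 0x02, 0x04, 0x20); the class name `Attr` and the helper `letters` are hypothetical names for this illustration.

```python
from enum import IntFlag


class Attr(IntFlag):
    READ_ONLY = 0x01
    HIDDEN = 0x02
    SYSTEM = 0x04
    ARCHIVE = 0x20


def letters(attrs):
    """Render an attribute combination with the letters used in the text."""
    codes = [(Attr.READ_ONLY, "R"), (Attr.ARCHIVE, "A"),
             (Attr.HIDDEN, "H"), (Attr.SYSTEM, "S")]
    return "".join(letter for flag, letter in codes if attrs & flag)


attrs = Attr.READ_ONLY | Attr.ARCHIVE   # a protected, newly modified file
print(letters(attrs))                   # RA
attrs &= ~Attr.ARCHIVE                  # a backup tool clears the archive bit
print(letters(attrs))                   # R
```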

Each file in Windows 9x is associated with an icon that corresponds to the file type. An icon (pictogram) is a small illustration that helps you quickly identify the object with which it is associated.

Often, a file name template is used to designate several files at once or to shorten file names. A template is a name in which the substitute symbols "*" and "?" are used. The position where the "?" sign appears can contain any character. The "*" sign means that the position in which it appears, and all subsequent positions, can be occupied by any characters.

*.TXT - all files of TXT type;

A?.* - all files whose names begin with the letter A and consist of one or two letters.

1.2. Folders (directories)

As the range of tasks grows, the number of files on the disk increases greatly and, even with skillfully chosen file names, it becomes difficult to keep order on the disk and to find one's way among the files. A group of files on one medium, combined according to some criterion, can be stored in a folder. MS-DOS used the concept of a catalog, or directory. The analogy between folders and directories is not complete. Each directory can be considered a folder, but not every folder corresponds to a directory on the disk, and if it does, the directory may be located in a completely different place in the file structure. If a file name is stored in a folder (directory), then the file is said to be located in that directory. Each folder in Windows 9x has an icon and a name just like a file (but usually without an extension).

Any folder can be registered in another folder. Therefore, the file structure on disks is hierarchical and multi-level, or tree-like, with the main folder, or root directory (ROOT DIRECTORY), at its root. There is one such folder on each disk, and it is denoted by the "\" symbol. The root directory is created when the disk is formatted and cannot be renamed or deleted. It should be noted that it is not customary to create folders on floppy disks.

If one folder is directly contained within another, then the first is called a child (subdirectory), and the second is called a parent (superdirectory) of the first folder. MS-DOS uses the ".." character to indicate the parent directory.

MS-DOS supports the concepts of the current drive and the current directory. Initially, the current drive is the drive from which the system was booted, and the current directory is determined accordingly. The directory that the user is currently working with is called the current directory. The current drive is determined in the same way. The current directory of the current drive is called the working directory. Windows also supports this concept, but in a slightly different way. For example, the working folder in applications is changed implicitly, when documents are opened and saved.

An example of a fragment of a file structure on a disk is shown in Fig. 1.

Fig. 1

In Figure 1, the Documents directory is registered in the My folder directory, so Documents is said to be a subdirectory of My folder, and My folder is a superdirectory, or parent directory, of Documents.

Each folder (except the main one), in the same way as a file, has a number of properties associated with it. Folders have the Directory (D) attribute set, which distinguishes them from files, and each folder also has a creation date and time associated with it.

If there is a branched structure of files on the disk, specifying only the name is not enough to find a file (unless high-level Windows tools are used). You must specify the route (path) to the file. A route is a sequence of directory names separated by the "\" character that leads from the root directory (a full route) or from the current directory of the disk to the directory in which the desired file is located. Thus, the full file name, or file specification, has the following form:

[drive:][full_route\]name.type.

Square brackets denote optional parameters.

If the full name uses characters that are not allowed in short names (in an MS-DOS environment), the specification must be enclosed in quotation marks.

An example of a full file name: A:\PROGRAM\PASCAL\LAB.PAS.

For example, the DEMO.EXE file located in the PROGRAM subdirectory can be accessed:

DEMO.EXE, if the current directory is PROGRAM;

PROGRAM\DEMO.EXE, if the current directory is the root directory;

..\DEMO.EXE, if the current directory is PASCAL.
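
The three ways of addressing the file can be checked with Python's standard `ntpath` module, which applies Windows path rules on any platform. The drive A: and the directory layout are assumptions taken from the full-name example above (PASCAL as a subdirectory of PROGRAM); no real disk is accessed.

```python
import ntpath  # Windows path rules, usable on any platform

target = r"A:\PROGRAM\DEMO.EXE"

# (current directory, name used to address the file)
cases = [
    (r"A:\PROGRAM", "DEMO.EXE"),
    ("A:\\", r"PROGRAM\DEMO.EXE"),
    (r"A:\PROGRAM\PASCAL", r"..\DEMO.EXE"),
]

# Join each relative name to its current directory and collapse "..".
resolved = [ntpath.normpath(ntpath.join(cwd, rel)) for cwd, rel in cases]

for (cwd, rel), full in zip(cases, resolved):
    print(cwd, "+", rel, "->", full)

print(all(full == target for full in resolved))  # True
```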

1.3. Shortcuts

Windows 9x makes it possible to create another file system component on disks: shortcuts. A shortcut is a file containing a pointer (link) to some object in the resource tree: another file, a folder or a peripheral device. (The file structures of all available disks, as well as some input/output devices, are combined into the resource tree.) One object can have several shortcuts located in different folders. When you delete a shortcut, only the reference to the object is destroyed; the object itself does not change in any way. Double-clicking a document's shortcut implicitly launches the application associated with that document and loads the document into it for processing. Most often, shortcuts are placed on the desktop to facilitate access to frequently used objects. A shortcut is named according to the same rules as a file, but it is assigned the standard extension LNK (from LiNK - connection). The icon of the shortcut matches the icon of the object for which the shortcut was created, but has a curved arrow in the lower left corner.

If a shortcut is created for an MS-DOS application or a batch file, then instead of the shortcut a file with the PIF extension is generated. In Windows 95, this file can be considered as a special kind of shortcut that refers to an executable file for the MS-DOS environment.

1.4. Desktop

After Windows 9x has loaded, the monitor screen shows the Desktop, which is (notionally) the largest folder. The desktop itself is a system object, but unlike the objects located on it, it cannot be moved or copied to any of them. Any objects from the resource tree can be placed on the desktop; usually it contains only standard (system) folders and shortcuts for the objects that are accessed most often.

A standard (system) folder is a folder created and maintained by Windows itself. Here are some of the standard folders located on the desktop:

The My Computer folder is an image of the computer and gives access to its resources. Having gained access to an object, you can perform the required operations on it or change its properties;

The Recycle Bin folder. Deleted files and shortcuts go into this folder so that they can be restored if necessary. The size of the Recycle Bin is adjustable.

These two folders are required, the rest are not. The distinguishing features of standard folders are that, in most cases, they cannot be deleted or renamed, that they have special properties, and that their context menus contain specific commands. From the point of view of Windows, the desktop is also a standard (system) folder.

Control questions:

1. What is a file, file name and extension, template?

2. What files are called executable?

3. What is a folder (directory), subdirectory, root and parent directory?

4. Which folders are standard?

5. Define the specification, or full file name.

6. What is a shortcut?

MS-DOS COMMANDS

Commands are entered on the command line after the system prompt appears, or are run from a batch file. The prompt is displayed when the OS is ready to accept a command.

MS-DOS command format:

command [parameters]

Parameters are separated from the command by spaces. If the user omits parameters or switches, the system uses their default values. The /? switch displays help for a command. You can interrupt a command or program by pressing Ctrl+C (or Ctrl+Break), and pause screen output by pressing Ctrl+S (or Pause); pressing any key continues the output.

There are two types of MS-DOS commands: built-in (internal) and loadable (external). Built-in commands are the simplest and most frequently used; they are part of the COMMAND.COM command processor and therefore do not appear in directory listings (for example, DIR, COPY, DEL). External commands are all the others, which are stored permanently as files on disk (for example, FORMAT). Before running an external command, make sure its file is present on disk. Let's look at some MS-DOS commands.

2.1 Changing the current drive

To change the current drive, type the name of the drive that should become current, followed by the symbol ":".

For example, the command

C:

switches from drive A: to drive C:.

2.2 Changing the current directory

CD (CHDIR) [drive:]path

For example,

CD PROGRAM - transition to the PROGRAM subdirectory;

CD.. - goes to the parent directory.

2.3 Displaying a file on the screen

TYPE [drive:][path\]name.ext

For example,

TYPE \PROGRAM\PASCAL\lab.txt ;

TYPE AUTOEXEC.BAT .

2.4 Deleting a file or group of files

DEL [drive:][path\]name.ext

This command accepts wildcards.

For example,

DEL *.* - deletes all files in the current directory.

2.5 Viewing a directory

DIR [drive:][path\][name.ext]

For each file, the command reports its name, extension, size in bytes, and the date and time the file was created or last updated. At the end, the amount of free disk space is reported. The /P switch pauses the listing each time the screen fills; press any key to continue. With the /W switch, only file names (with extensions) are displayed, five per line.

2.6 Creating a subdirectory

MD (MKDIR) [drive:]path

2.7 Deleting a subdirectory

RD (RMDIR) [drive:]path

Any subdirectory can be deleted with this command, provided it contains no files or other subdirectories (this prevents files from being lost through accidental deletion). The current directory and the root directory cannot be deleted.

2.8 Renaming files

REN [drive:][path\]old_name new_name

This command changes the name of a file without changing its contents. The command accepts wildcards.

2.9 Clearing the screen

CLS

2.10 Displaying the operating system version

VER

When you enter this command, the operating system version number appears on the screen. Knowing the version is necessary because the toolset grows from release to release: commands and programs written for a later version may not work at all on an earlier one, or may behave differently.

2.11 Setting the current time

TIME [hh:mm[:ss[.cc]]]

This command sets the current time, either when MS-DOS starts or at any other moment while working on the machine; hh is hours, mm minutes, ss seconds and cc hundredths of a second. When the command is run without parameters, the current time is displayed and a new one is requested; pressing Enter leaves the current time unchanged.

2.12 Setting the current date

DATE [mm-dd-yy]

The command sets the current date in the same way that the TIME command sets the current time.

2.13 Viewing the subdirectory tree

TREE [/F]

This command displays a structured list of all subdirectories on the current disk. With the /F switch, it also lists the files contained in these subdirectories.

2.14 Copying individual files

The COPY command copies files from disk to disk, exchanges data between peripheral devices, and can merge data during copying.

COPY [drive:][path\]source [drive:][path\][destination]

where source is the name of the original file with its extension and destination is the name of the new file with its extension. The /V switch verifies that each copy was written correctly. This command accepts wildcards.

When the COPY command is used to exchange information between peripheral devices, the special device names CON, PRN, NUL, etc. are substituted for file names. They have the following meanings:

CON - console: keyboard for data entry, video display for displaying results and controlling dialogue;

PRN is the primary printer associated with your system;

NUL - pseudo-device (non-existent) for testing programs.

The COPY command allows you to combine multiple files into one with a "+" sign. With this combination (concatenation), the source files do not change, and the current time and date will be written to the new file.

1) COPY PASCAL\*.PAS B: ,

All files with the PAS type are copied from the PASCAL subdirectory to drive B:

2) COPY FILE.EXT PRN ,

Printing the FILE.EXT file.

3) COPY CON FILE.EXT ,

entering data from the keyboard into the file FILE.EXT; the end of the file is marked with the key combination Ctrl+Z followed by Enter (this is a way to create a file in MS-DOS).

4) COPY FILE1.EXT+FILE2.EXT+FILE3.EXT BOOK.EXT ,

combining several files into one BOOK.EXT.
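The concatenation in example 4 can be imitated outside MS-DOS. The following Python sketch is only an illustration: the function name `concat_files` is hypothetical, and the sketch copies raw bytes without reproducing MS-DOS details such as the treatment of the end-of-file character in text mode.

```python
import shutil
from pathlib import Path


def concat_files(sources, destination):
    """Write the bytes of each source file, in order, into a new destination file."""
    with open(destination, "wb") as out:
        for src in sources:
            with open(src, "rb") as part:
                # Source files are only read, so they stay unchanged.
                shutil.copyfileobj(part, out)
    return Path(destination)
```

As with COPY, the source files are left untouched and the destination receives a fresh modification time because it is newly written.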

2.15 Write protection of files

ATTRIB [+R | -R] [+A | -A] [drive:][path\]filename

+R - sets file write protection (read-only);

-R - cancels file write protection;

+A - sets the archive attribute of the file;

-A - clears the archive attribute of the file;

ATTRIB +R FILE.EXT - information can no longer be written to this file;

ATTRIB FILE.EXT - queries the attributes of FILE.EXT, including whether it can be written to. Operating system response:

R A:\FILE.EXT, i.e. the file is read-only.
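For comparison, a rough analogue of ATTRIB +R and ATTRIB -R can be sketched in Python with permission bits. The function names `set_read_only` and `is_read_only` are hypothetical, and the mapping is an assumption: MS-DOS stores a single read-only attribute, whereas this sketch toggles the write bits reported by `os.stat`.

```python
import os
import stat


def set_read_only(path, read_only=True):
    """Approximate ATTRIB +R (read_only=True) or ATTRIB -R (read_only=False)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    if read_only:
        mode &= ~write_bits      # remove every write permission
    else:
        mode |= stat.S_IWUSR     # give the owner write permission back
    os.chmod(path, mode)


def is_read_only(path):
    """Approximate the query form ATTRIB filename: True if the owner cannot write."""
    return not (os.stat(path).st_mode & stat.S_IWUSR)
```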

2.16. Redirecting data

> - redirects output. Data that would normally be displayed on the screen is sent to a peripheral device or to a disk file. In the latter case, the file is created if necessary; if the file already exists, its old contents are replaced with the new data.

TYPE FILE.TXT > PRN

ECHO Group meeting tomorrow > PRN

>> - also redirects output, but if the file already exists, the new data is appended to the old.

< - redirects input. Data is read not from the keyboard but from a peripheral device or a disk file.

PROGRAM < FILE.TXT

Note: The program whose execution we want to redirect must use standard I/O functions.
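The difference between > and >> can be reproduced in Python as a small sketch. The helper `run_redirected` is hypothetical; it runs a one-line Python program as the child process, on the assumption that the child writes through standard output, as the note above requires.

```python
import subprocess
import sys


def run_redirected(code, path, append=False):
    """Run a tiny Python program with its standard output sent to a file.

    append=False behaves like '>' (old contents are replaced);
    append=True behaves like '>>' (new output is added at the end).
    """
    mode = "a" if append else "w"
    with open(path, mode) as target:
        subprocess.run([sys.executable, "-c", code], stdout=target, check=True)
```

Calling `run_redirected("print('hello')", "FILE.TXT")` twice leaves one line in the file, while adding `append=True` to the second call leaves two.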

2.17. Pipelines

You can chain commands or programs so that the screen output of each one is used as the keyboard input of the next: A1 | A2 | A3.

ECHO Y | DEL *.* >NUL - automatically answers Y (Yes) to the "Are you sure..." prompt when deleting all files in the directory.

The | (pipe) symbol passes data from one program to the next. Pipelines are most useful in combination with filter commands and redirection.
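The chaining idea A1 | A2 | A3 can be sketched in Python. The helper `pipe` is hypothetical and uses one-line Python programs as the stages. Each stage runs to completion before the next one starts, which is a simplification but is close to how a single-tasking system such as MS-DOS handles pipes (through temporary files).

```python
import subprocess
import sys


def pipe(*programs):
    """Run small Python programs in sequence, feeding each one's output to the next."""
    data = b""
    for code in programs:
        result = subprocess.run(
            [sys.executable, "-c", code],
            input=data,                 # becomes the stage's standard input
            stdout=subprocess.PIPE,     # capture the stage's standard output
            check=True,
        )
        data = result.stdout
    return data.decode()
```

For example, `pipe("print('b'); print('a')", "import sys; print(''.join(sorted(sys.stdin)), end='')")` passes two lines from the first stage into a sorting stage.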

2.18. Filters FIND, MORE, SORT

a) Searching a disk file for specified data (a phone number, an address, any phrase):

FIND [/C] [/N] [/V] "phrase" [path\]filename

where /C counts the matches, i.e. reports how many lines contain the phrase without displaying the lines themselves;

/N - displays the line number in addition to the line itself;

/V - displays all lines that do not contain the phrase.

FIND "group" FILE.TXT - displays the lines of the file that contain the word "group".

DIR | FIND /V "COM" - displays all directory entries except those containing COM, such as files with the COM extension.

FIND "car" AB.DAT, B.DAT, C.DAT - searches several files at once, for example for lines about car expenses.

b) Page-by-page display

MORE< FILE.TXT

TYPE FILE.EXT | MORE

c) Sorting data.

SORT [/R] [/+n] (by default, lines are sorted alphabetically in ascending order, starting from the first character),

where /R - sorts in descending order;

/+n - sorts lines starting from column n.

When SORT is run without input redirection, the data is entered from the keyboard; Ctrl+Z (^Z) marks the end of the input.

It is advisable to write the result to a file, i.e. SORT < CON > FILE.TXT.

DIR | SORT - directory entries are sorted by file (directory) name.

DIR | SORT /+10 > FILE.EXT -

the list of files is ordered by extension (Windows 9x) and written to FILE.EXT.
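The behaviour of the FIND and SORT switches described above can be modelled on lists of text lines. This is a hedged sketch: `find` and `sort_lines` are hypothetical helpers, the "[n]" prefix used for numbered output is an assumption about formatting, and the comparison is a plain case-sensitive string comparison rather than the exact MS-DOS collating order.

```python
def find(lines, phrase, count=False, numbered=False, invert=False):
    """Imitate FIND: keep lines containing phrase; invert=/V, numbered=/N, count=/C."""
    hits = [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if (phrase in line) != invert
    ]
    if count:
        return len(hits)
    if numbered:
        return [f"[{number}]{line}" for number, line in hits]
    return [line for _, line in hits]


def sort_lines(lines, reverse=False, column=1):
    """Imitate SORT: compare lines from a 1-based column (/+n); reverse=/R."""
    return sorted(lines, key=lambda line: line[column - 1:], reverse=reverse)
```

Chaining the two, as in DIR | FIND /V "COM" | SORT, corresponds to `sort_lines(find(listing, "COM", invert=True))`, where `listing` is a list of directory lines.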

One of the main tasks of the OS is to ensure the exchange of data between applications and computer peripheral devices. In modern operating systems, the functions of data exchange with peripheral devices are performed by input/output subsystems. The input/output subsystem includes drivers for controlling external devices and a file system.

To make it convenient for the user to work with data stored on disks, the OS replaces the physical organization of the data with a logical model. The logical structure is the directory tree displayed on the screen by programs such as Explorer.

File - a named area of external memory to which data can be written and from which data can be read. Files are stored in non-volatile memory, usually on magnetic disks. Data is organized into files for long-term, reliable storage and for sharing information. Attributes can be set for a file; in computer networks, access rights can also be set.

The file system includes:

The collection of all files on a logical disk;

Data structures that are used to manage files - tables of free and used disk space, tables of file locations, etc.

System software tools that allow you to perform operations on files, such as creating, deleting, copying, moving, renaming, searching.

Each OS has its own file system.

File system functions:

Disk memory allocation;

Naming the file;

Mapping the file name to the corresponding physical address in external memory;

Providing access to data;

Data protection and recovery;

File types

File systems support several functionally different file types, which typically include:

Regular files, or simply files that contain arbitrary information that the user enters into them or that is created as a result of the operation of system or user programs. The contents of a regular file are determined by the application that works with it. Regular files are divided into two broad classes: executable and non-executable. The OS must be able to recognize its own executable file.

Directories - a special type of file containing system information about the set of files located in the directory (their names and information about them). From the user's point of view, directories make it possible to organize the storage of data on disk. From the OS point of view, directories are used to manage files.

Special files are dummy files that correspond to I/O devices and are designed to execute I/O commands.

As a rule, the file system has a hierarchical structure, at the top of which there is a single root directory, the name of which is the same as the name of the logical drive, and levels are created by the fact that a lower-level directory is included in a higher-level directory.

Each file of any type has its own symbolic name; the rules for forming symbolic names differ between operating systems. Hierarchically organized file systems use three types of names: simple (symbolic), full (compound), and relative.

A simple name identifies a file within a single directory. Files can have the same simple name if they are located in different directories: "many files - one simple name."

A full name is the sequence of simple symbolic names of all directories on the path from the root to the given file, followed by the name of the file itself. The full name uniquely identifies the file in the file system: "one file - one full name."

A relative name is defined through the concept of the current directory, that is, the directory in which the user is currently working. The file system remembers the name of the current directory and prepends it to a relative name to form the full name. The user writes the file name starting from the current directory.
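The way a relative name is completed with the current directory can be tried out with Python's standard `ntpath` module, which applies DOS/Windows path rules on any platform. The wrapper `full_name` is a hypothetical helper, and the directory names come from the earlier PROGRAM\PASCAL examples.

```python
import ntpath  # DOS/Windows path rules, usable on any platform


def full_name(current_directory, name):
    """Complete a relative name with the current directory; full names pass through."""
    return ntpath.normpath(ntpath.join(current_directory, name))


# Current directory is PASCAL: "..\DEMO.EXE" refers to the parent directory.
print(full_name(r"C:\PROGRAM\PASCAL", r"..\DEMO.EXE"))
# Current directory is the root: the relative name is appended to it.
print(full_name("C:\\", r"PROGRAM\DEMO.EXE"))
```

Both calls yield the same full name, which illustrates "one file - one full name" even though the relative names differ.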

If the OS supports several external memory devices (hard drive, floppy drive, CD ROM), then file storage can be organized in two ways:

1. Each device hosts an autonomous (its own) file system, that is, the files located on this device are described by their directory tree as not being related to the directory tree of another device;

2. Mounting file systems (UNIX OS). The user has the opportunity to combine file systems located on different devices into a single file system, which will have a single directory tree.

File attributes– properties assigned to the file. Main attributes – Read Only, System, Hidden, Archive.

The OS file system must provide the user with a set of file operations in the form of system calls. This set includes the system calls create (create a file), read, write, close and some others. Work with a file usually consists not of a single operation but of a sequence of them, as for example in a text editor. Whatever operation is performed on a file, the OS must carry out a number of actions common to all operations:

1. Using the symbolic name of the file, find its characteristics, which are stored in the file system on the disk;

2. Copy the file characteristics to main memory (RAM);

3. Based on the file characteristics, check the access rights to perform the requested operation (read, write, delete);

4. After performing an operation with a file, clear the memory area allocated for temporary storage of file characteristics.

Working with a file begins with a system call OPEN, which copies file characteristics and checks permissions, and ends with a system call CLOSE, which frees the buffer with characteristics and makes it impossible to continue working with the file without reopening it.
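The open, work, close sequence can be observed through Python's thin wrappers over the operating system's calls. This is a minimal sketch: the function `write_then_read` is hypothetical, and it only demonstrates that the descriptor returned by the open call stops being usable after close.

```python
import os


def write_then_read(path, payload):
    """Open a file, write and read it back, close it, then probe the closed descriptor."""
    # OPEN: the OS locates (or creates) the file and checks access rights.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)      # return to the start of the file
        data = os.read(fd, len(payload))
    finally:
        os.close(fd)                      # CLOSE: the descriptor is released
    try:
        os.read(fd, 1)                    # further work requires reopening
        usable_after_close = True
    except OSError:
        usable_after_close = False
    return data, usable_after_close
```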

The file organization of data is the distribution of files across directories and of directories across logical drives: logical drive - directory - file. The user can obtain information about the file organization of data.

The principles by which files, directories and system information are placed on a specific external memory device are called the physical organization of the file system.

General. Computer science defines three main types of data structures: linear, tabular and hierarchical. A book is an example of all three: the sequence of sheets is a linear structure; parts, sections, chapters and paragraphs form a hierarchy; and the table of contents is a table that connects the hierarchical structure with the linear one. Structured data acquires a new attribute, the address. So:

Linear structures (lists, vectors). These are ordinary lists in which the address of each element is uniquely determined by its number. If all elements of the list have equal length, the list is a data vector.

Tabular structures (tables, matrices). A table differs from a list in that the address of each element consists of several parameters rather than one. The most common example is a matrix, where the address has two parameters: the row number and the column number. Tables can also be multidimensional.

Hierarchical structures. These are used to represent irregular data. The address of an element is determined by the route leading to it from the top of the tree; a computer's file system is an example. (The route can be longer than the data itself. In a dichotomy there are always exactly two branches, left and right.)

Ordering data structures. The main method is sorting. Note that adding a new element to an ordered structure may change the addresses of existing elements. Hierarchical structures use indexing instead: each element is given a unique number, which is then used in sorting and searching.
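The three kinds of addresses, and the effect of insertion on an ordered structure, can be shown in a few lines of Python. All names and sample values here are hypothetical and chosen only for illustration.

```python
from bisect import insort

# Linear structure: the address is a single number.
vector = ["sheet 1", "sheet 2", "sheet 3"]

# Tabular structure: the address is a (row, column) pair.
table = [[11, 12, 13],
         [21, 22, 23]]

# Hierarchical structure: the address is a route from the top of the tree.
tree = {"Part I": {"Chapter 1": {"Section 1.1": "text"}}}


def by_route(node, route):
    """Follow a route of names down from the top of the tree."""
    for name in route:
        node = node[name]
    return node


# Inserting into an ordered list shifts the addresses of later elements.
ordered = [10, 30, 40]
address_before = ordered.index(30)
insort(ordered, 20)
address_after = ordered.index(30)
```

Here `vector[1]`, `table[1][2]` and `by_route(tree, ["Part I", "Chapter 1", "Section 1.1"])` each retrieve one element, and the element 30 moves from one position to the next after the insertion.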

Basic elements of a file system

The historical first step in data storage and management was the use of file management systems.

A file is a named area of external memory that can be written to and read from. It is characterized by three properties:

a sequence of an arbitrary number of bytes;

a unique proper name (in effect, an address);

data of a single type, which determines the file type.

The rules for naming files, how the data stored in a file is accessed, and the structure of that data depend on the particular file management system and possibly on the file type.

The first file system that was well developed in the modern sense was created by IBM for its 360 series (1965-1966). It is practically unused in current systems. It used list data structures (EC-volume, section, file).

Most of you are familiar with the file systems of modern operating systems, primarily MS-DOS and Windows, and some of you with the structure of the file systems of various UNIX variants.

File structure. A file represents a collection of data blocks located on external media. To exchange with a magnetic disk at the hardware level, you need to specify the cylinder number, surface number, block number on the corresponding track and the number of bytes that need to be written or read from the beginning of this block. Therefore, all file systems explicitly or implicitly allocate some basic level that ensures work with files that represent a set of directly addressable blocks in the address space.

Naming files. All modern file systems support multi-level file naming by maintaining additional files with a special structure, directories, in external memory. Each directory contains the names of the directories and/or files contained in that directory. Thus, the full name of a file consists of a list of directory names plus the name of the file in the directory immediately containing it. File systems differ in where the chain of names begins: in UNIX it starts at a single root directory, while in DOS and Windows it starts with the name of a logical drive.

File protection. File management systems must provide authorization for access to files. In general, the approach is that in relation to each registered user of a given computer system, for each existing file, actions that are allowed or prohibited for this user are indicated. There have been attempts to implement this approach in full. But this caused too much overhead both in storing redundant information and in using this information to control access eligibility. Therefore, most modern file management systems use the file protection approach first implemented in UNIX (1974). In this system, each registered user is associated with a pair of integer identifiers: the identifier of the group to which this user belongs, and his own identifier in the group. Accordingly, for each file, the full identifier of the user who created this file is stored, and it is noted what actions he himself can perform with the file, what actions with the file are available to other users of the same group, and what users of other groups can do with the file. This information is very compact, requires few steps during verification, and this method of access control is satisfactory in most cases.
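The owner / group / others check described above can be written as a short Python sketch. The function `may_access` is hypothetical; it deliberately ignores the superuser and supplementary groups, which real UNIX systems also take into account.

```python
import stat

# Permission bits for each action, in the order (owner, group, others).
_BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}


def may_access(mode, file_uid, file_gid, uid, gid, action):
    """Decide whether user (uid, gid) may perform action 'r', 'w' or 'x' on a file."""
    owner_bit, group_bit, other_bit = _BITS[action]
    if uid == file_uid:                  # the user who created the file
        return bool(mode & owner_bit)
    if gid == file_gid:                  # another user of the same group
        return bool(mode & group_bit)
    return bool(mode & other_bit)        # users of other groups
```

With mode `0o640`, for instance, the owner may read and write, members of the owner's group may only read, and everyone else is refused. The whole decision needs only nine bits and two identifiers per file, which is why the scheme is so compact.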

Multi-user access mode. If the operating system supports multi-user mode, it is quite possible for two or more users to try to work with the same file at the same time. If all of these users are only going to read the file, nothing bad will happen. But if at least one of them changes the file, mutual synchronization is required for the group to work correctly. Historically, file systems have taken the following approach. In the operation of opening a file (the first and mandatory operation with which a session of work with a file must begin), the operating mode (reading or changing) was specified among other parameters. In addition, there are special procedures for synchronizing user actions. Synchronization at the level of individual records is not provided.

Journaling in file systems. General principles.

Running a file system check (fsck) on large file systems can take a long time, which is unfortunate given today's high-speed systems. A file system may lose integrity because it was not unmounted correctly, for example if the disk was being written to at the moment of shutdown. Applications may have been updating the data contained in files, and the system may have been updating file system metadata, that is, "data about file system data": information about which blocks are associated with which files, which files are located in which directories, and the like. Errors (loss of integrity) in data files are bad, but errors in file system metadata are much worse, because they can lead to file loss and other serious problems.

To minimize integrity problems and system restart time, a journaled file system keeps a list of the changes it is going to make to the file system before actually writing them. These records are stored in a separate part of the file system called the "journal" or "log". Once the journal entries have been securely written, the journaling file system applies the changes to the file system and then deletes the entries from the journal. Journal entries are organized into sets of related file system changes, much as changes to a database are organized into transactions.

A journaled file system increases the likelihood of integrity because log file entries are made before changes are made to the file system, and because the file system retains those entries until they are fully and securely applied to the file system. When you reboot a computer that uses a journaled file system, the mount program can ensure the integrity of the file system by simply checking the log file for changes that were expected but not made and writing them to the file system. In most cases, the system does not need to check the integrity of the file system, which means that a computer using a journaled file system will be available for use almost immediately after a reboot. Accordingly, the chances of data loss due to problems in the file system are significantly reduced.

A journaled file system may record in the journal (log) only changes to file system metadata, or it may record changes to all file system data, including changes to the files themselves.
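The write-ahead principle described in this section can be modelled with a toy in-memory store. The class `JournaledStore` is a hypothetical teaching model, not the on-disk format of any real file system: a dictionary stands in for the file system and a list stands in for the journal.

```python
class JournaledStore:
    """Toy model: every change is journaled before it is applied."""

    def __init__(self):
        self.data = {}      # stands in for the file system proper
        self.journal = []   # stands in for the on-disk journal

    def log(self, changes):
        """Record a complete transaction in the journal before touching the data."""
        self.journal.append(dict(changes))

    def apply_next(self):
        """Apply the oldest journaled transaction, then remove it from the journal."""
        changes = self.journal[0]
        self.data.update(changes)   # the entry is kept until this has finished
        self.journal.pop(0)

    def recover(self):
        """After a crash, replay whatever transactions are still in the journal."""
        while self.journal:
            self.apply_next()
```

If a crash happens after `log` but before `apply_next` completes, the entry is still in the journal, so `recover` can replay it. Replaying an already applied entry is harmless in this model because setting the same keys again gives the same result.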

File system MS-DOS (FAT)

The MS-DOS file system is a tree-structured file system intended for small disks and simple directory structures; the root of the tree is the root directory, and the leaves are files and directories, which may be empty. Files managed by this file system are stored in clusters, whose size can range from 4 KB to 64 KB in multiples of 4; the clusters of one file do not have to be adjacent on disk. For example, the figure shows three files. File1.txt is fairly large and occupies three consecutive blocks. The small file File3.txt uses only one allocated block. The third file, File2.txt, is a large fragmented file. In each case, the directory entry points to the first block belonging to the file. If a file uses several blocks, each block points to the next one in the chain. The value FFF marks the end of the chain.
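The chain of blocks for the three files can be followed with a few lines of Python. This is a sketch under assumptions: the cluster numbers are invented for illustration (the figure's actual numbers are not reproduced here), the table is a plain dictionary, and `cluster_chain` is a hypothetical helper.

```python
END_OF_CHAIN = 0xFFF  # the end-of-sequence value mentioned in the text


def cluster_chain(fat, first_cluster):
    """Return the list of clusters of a file, starting from its first cluster."""
    chain = []
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in chain:
            raise ValueError("loop in allocation chain")
        chain.append(cluster)
        cluster = fat[cluster]      # each entry names the next cluster of the file
    return chain


# Hypothetical table: entry number -> next cluster of the same file.
fat = {
    2: 3, 3: 4, 4: END_OF_CHAIN,    # File1.txt: three consecutive clusters
    5: 7, 7: 9, 9: END_OF_CHAIN,    # File2.txt: fragmented
    6: END_OF_CHAIN,                # File3.txt: a single cluster
}
```

With this table, `cluster_chain(fat, 5)` visits clusters 5, 7 and 9, skipping cluster 6, which belongs to another file.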

FAT disk partition

For efficient access to files, a file allocation table (FAT) is used; it is located at the beginning of the partition (or logical drive). The file system takes its name, FAT, from this table. To protect the partition, two copies of the FAT are stored on it in case one of them becomes corrupted. In addition, the file allocation tables must be placed at strictly fixed addresses so that the files needed to start the system can be located correctly.

The file allocation table consists of 16-bit elements and contains the following information about each logical disk cluster:

the cluster is not used;

the cluster is used by the file;

bad cluster;

last cluster of a file.

Since each cluster must be assigned a unique 16-bit number, FAT supports a maximum of 2^16, or 65,536, clusters on one logical disk (some of which it reserves for its own needs). With the largest cluster size of 64 KB, this gives a maximum disk size served by MS-DOS of 4 GB. The cluster size can be increased or decreased depending on the disk size. However, once the disk size exceeds a certain value, the clusters become too large, which leads to internal fragmentation of the disk. In addition to information about files, the file allocation table can also contain information about directories. Directories are treated as special files with a 32-byte entry for each file contained in the directory. The root directory has a fixed size: 512 entries on a hard disk, while on floppy disks the size depends on the capacity of the floppy disk. The root directory is located immediately after the second copy of the FAT because it contains the files needed by the MS-DOS boot loader.
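The figures in this paragraph can be checked with simple arithmetic. The Python sketch below uses only the numbers given in the text (16-bit entries, 64 KB clusters, 512 root entries of 32 bytes); the helper `wasted_bytes` is hypothetical and shows why large clusters waste space.

```python
ENTRY_BITS = 16                    # width of one FAT element
MAX_CLUSTER_SIZE = 64 * 1024       # largest cluster size, in bytes

max_clusters = 2 ** ENTRY_BITS
max_volume_gib = max_clusters * MAX_CLUSTER_SIZE / 2 ** 30

ROOT_ENTRIES = 512                 # fixed root directory size on a hard disk
ENTRY_SIZE = 32                    # bytes per directory entry
root_dir_bytes = ROOT_ENTRIES * ENTRY_SIZE


def wasted_bytes(file_size, cluster_size):
    """Unused tail of the last cluster of a file (internal fragmentation)."""
    return -file_size % cluster_size


print(max_clusters, max_volume_gib, root_dir_bytes)
print(wasted_bytes(1, MAX_CLUSTER_SIZE))
```

A one-byte file on a volume with 64 KB clusters still occupies a whole cluster, so almost the entire cluster is lost; this is the internal fragmentation that grows with disk size.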

When searching for a file on a disk, MS-DOS has to walk through the directory structure to find it. For example, to run the executable file C:\Program\NC4\nc.exe, MS-DOS does the following:

reads the root directory of the C: drive and looks for the Program directory in it;

reads the initial cluster Program and looks in this directory for an entry about the NC4 subdirectory;

reads the initial cluster of the NC4 subdirectory and looks for an entry for the nc.exe file in it;

reads all clusters of the nc.exe file.
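The steps above can be sketched as a walk over nested dictionaries. The structure `disk` and the function `lookup` are hypothetical stand-ins for on-disk directories; the sketch counts how many directories must be read and ignores details such as case-insensitive names.

```python
def lookup(root, path):
    """Resolve a full name one directory at a time, counting directories read."""
    drive, *parts = path.split("\\")
    node = root[drive]                  # start at the root directory of the drive
    directories_read = 0
    for name in parts:
        directories_read += 1           # read the directory that should hold `name`
        node = node[name]               # KeyError means there is no such entry
    return node, directories_read


# Hypothetical directory tree for the example in the text.
disk = {"C:": {"Program": {"NC4": {"nc.exe": "<file clusters>"}}}}

contents, reads = lookup(disk, r"C:\Program\NC4\nc.exe")
```

Three directories (the root, Program and NC4) are read before the file itself, so each extra level of nesting adds one more directory read.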

This search method is not the fastest among current file systems. Moreover, the greater the depth of the directories, the slower the search will be. To speed up the search operation, you should maintain a balanced file structure.

Advantages of FAT

It is the best choice for small logical drives because it starts with minimal overhead. On disks no larger than 500 MB, it works with acceptable performance.

Disadvantages of FAT

Since the size of a file entry is limited to 32 bytes, and the information must include the file size, date, attributes, etc., the size of the file name is also limited and cannot exceed 8+3 characters for each file. The use of so-called short file names makes FAT less attractive to use than other file systems.

Using FAT on disks larger than 500 MB is impractical because of disk fragmentation.

The FAT file system does not have any security features and supports minimal information security capabilities.

The speed of operations in FAT is inversely proportional to the depth of directory nesting and disk space.

UNIX file systems (ext3)

The modern, powerful and free Linux operating system provides a wide area for the development of modern systems and custom software. Some of the most interesting developments in recent Linux kernels are new, high-performance technologies for managing the storage, placement, and updating of data on disk. One of the most interesting mechanisms is the ext3 file system, which has been integrated into the Linux kernel since version 2.4.16, and is already available by default in Linux distributions from Red Hat and SuSE.

The ext3 file system is a journaling file system, 100% compatible with all utilities created to create, manage and fine-tune the ext2 file system, which has been used on Linux systems for the last several years. Before describing in detail the differences between the ext2 and ext3 file systems, let us clarify the terminology of file systems and file storage.

At the system level, all data on a computer exists as blocks of data on some storage device, organized using special data structures into partitions (logical sets on a storage device), which in turn are organized into files, directories and unused (free) space.

File systems are created on disk partitions to simplify the storage and organization of data in the form of files and directories. Linux, like Unix, uses a hierarchical file system made up of files and directories, where directories in turn contain files or other directories. Files and directories in a Linux file system are made available to the user by mounting the file system (the "mount" command), which is usually part of the system boot process. The list of file systems available for use is stored in the /etc/fstab file (FileSystem TABle). The list of file systems currently mounted by the system is stored in the /etc/mtab (Mount TABle) file.

When a filesystem is mounted during boot, a bit in the header (the "clean bit") is cleared, indicating that the filesystem is in use, and that the data structures used to control the placement and organization of files and directories within that filesystem can be changed.

A file system is considered complete if all data blocks in it are either in use or free; each allocated data block is occupied by one and only one file or directory; all files and directories can be accessed after processing a series of other directories in the file system. When a Linux system is deliberately shut down using operator commands, all file systems are unmounted. Unmounting a file system during shutdown sets a "clean bit" in the file system header, indicating that the file system was properly unmounted and can therefore be considered intact.

Years of file system debugging and redesign, together with improved algorithms for writing data to disk, have greatly reduced data corruption caused by applications or by the Linux kernel itself, but eliminating corruption and data loss due to power outages and other system problems is still a challenge. If a Linux system crashes or is simply switched off without the standard shutdown procedure, the "clean bit" is not set in the file system header. The next time the system boots, the mount process detects that the file system is not marked as "clean" and physically checks its integrity using the Linux/Unix file system check utility "fsck" (File System ChecK).

There are several journaling file systems available for Linux. The best known of them are: XFS, a journaling file system developed by Silicon Graphics and now released as open source; ReiserFS, a journaling file system designed specifically for Linux; JFS, a journaling file system originally developed by IBM and now released as open source; and ext3, a file system developed by Dr. Stephen Tweedie at Red Hat. There are several other systems as well.

The ext3 file system is a journaled Linux version of the ext2 file system. The ext3 file system has one significant advantage over other journaling file systems - it is fully compatible with the ext2 file system. This makes it possible to use all existing applications designed to manipulate and customize the ext2 file system.

The ext3 filesystem is supported by Linux kernels version 2.4.16 and later, and must be enabled using the Filesystems Configuration dialog when building the kernel. Linux distributions such as Red Hat 7.2 and SuSE 7.3 already include native support for the ext3 file system. You can only use the ext3 filesystem if ext3 support is built into your kernel and you have the latest versions of the "mount" and "e2fsprogs" utilities.

In most cases, converting file systems from one format to another entails backing up all contained data, reformatting the partitions or logical volumes containing the file system, and then restoring all data to that file system. Due to the compatibility of the ext2 and ext3 file systems, all these steps do not need to be carried out, and the translation can be done using a single command (run with root privileges):

# /sbin/tune2fs -j <partition-name>

For example, converting an ext2 file system located on the /dev/hda5 partition to an ext3 file system can be done using the following command:

# /sbin/tune2fs -j /dev/hda5

The "-j" option of the "tune2fs" command creates an ext3 journal on an existing ext2 file system. After converting the ext2 file system to ext3, you must also change its entry in /etc/fstab to indicate that the partition now holds an "ext3" file system. You can also rely on automatic detection of the file system type (the "auto" option), but it is still recommended to specify the type explicitly. The following example shows the /etc/fstab entry for the /dev/hda5 partition before and after the conversion:

/dev/hda5 /opt ext2 defaults 1 2

/dev/hda5 /opt ext3 defaults 1 0

The last field in /etc/fstab specifies the pass during the boot process in which the integrity of the file system is checked with the "fsck" utility. When using the ext3 file system, you can set this value to "0", as shown in the previous example. This means that "fsck" will never check the file system at boot, because its consistency is restored by replaying the journal.
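The six whitespace-separated fields of an /etc/fstab entry can be pulled apart with a few lines of Python. This is only an illustrative sketch: the names FstabEntry and parse_fstab_line are invented for this example, and the two sample entries are the ext2 and ext3 lines for /dev/hda5 discussed above.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: str
    dump: int       # fifth field: used by the dump backup utility
    fsck_pass: int  # sixth field: fsck pass at boot, 0 means "never check"


def parse_fstab_line(line: str) -> FstabEntry:
    """Split one non-comment fstab line into its six fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    device, mount_point, fs_type, options, dump, fsck_pass = fields
    return FstabEntry(device, mount_point, fs_type, options,
                      int(dump), int(fsck_pass))


before = parse_fstab_line("/dev/hda5 /opt ext2 defaults 1 2")
after = parse_fstab_line("/dev/hda5 /opt ext3 defaults 1 0")
print(after.fs_type, after.fsck_pass)
```

A parser like this rejects an entry that has a stray space inside the device name, since that produces seven fields instead of six.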

Converting the root file system to ext3 requires a special approach, and is best done in single user mode after creating a RAM disk that supports the ext3 file system.

In addition to being compatible with the ext2 utilities and allowing easy conversion from ext2 to ext3, the ext3 file system also offers several different types of journaling.

The ext3 file system supports three different journaling modes that can be selected in the /etc/fstab file. These journaling modes are as follows:

Journal - records all changes to file system data and metadata. The slowest of the three journaling modes. This mode minimizes the chance of losing changes you make to files on the file system.

Ordered (sequential) - journals changes to file system metadata only, but writes file data updates to disk before the changes to the associated metadata. This is the default ext3 journaling mode.

Writeback - journals only changes to file system metadata and leaves file data to the standard write-out process. This is the fastest journaling mode.

The differences between these journaling modes are both subtle and profound. Journal mode requires ext3 to write every change twice - first to the journal and then to the file system itself. This can reduce overall file system performance, but many users prefer this mode because it minimizes the chance of losing changes to your files: both metadata changes and file data changes are written to the ext3 journal and can be replayed when the system is rebooted.

In ordered mode, only changes to file system metadata are journaled, which reduces the redundancy between writes to the file system and writes to the journal; this is why the mode is faster. Although changes to file data are not written to the journal, they must reach the disk before the ext3 journaling daemon commits the changes to the associated metadata, which may slightly reduce system performance. This journaling mode ensures that files on the file system are never out of sync with the associated metadata.

Writeback mode is faster than the other two because it journals only changes to file system metadata and does not wait for the associated file data to be written first (before updating things like file size and directory information). Since file data is updated asynchronously with respect to the journaled metadata changes, files may end up with metadata that does not match their contents, for example an incorrect record of which file owns certain data blocks (whose update had not completed when the system was rebooted). This is not fatal, but it may inconvenience the user.
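The trade-offs between the three modes can be summarized in a small table. The sketch below is a toy model written for this text, not code from ext3; the dictionary layout and both function names are assumptions made purely for illustration.

```python
# Toy model of the three ext3 journaling modes described above.
# "journaled" lists what goes through the journal; "data_before_metadata"
# says whether file data is guaranteed to be safe before the metadata
# that describes it is committed.
MODES = {
    "journal":   {"journaled": {"data", "metadata"}, "data_before_metadata": True},
    "ordered":   {"journaled": {"metadata"},         "data_before_metadata": True},
    "writeback": {"journaled": {"metadata"},         "data_before_metadata": False},
}


def writes_per_data_block(mode: str) -> int:
    """A data block is written twice if it is journaled, once otherwise."""
    return 2 if "data" in MODES[mode]["journaled"] else 1


def stale_data_possible(mode: str) -> bool:
    """True if metadata can reach the disk before the data it describes."""
    return not MODES[mode]["data_before_metadata"]


for name in MODES:
    print(name, writes_per_data_block(name), stale_data_possible(name))
```

Read this way, journal mode pays with a double write, writeback pays with possible mismatches after a crash, and ordered mode sits between the two.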

The journaling mode used on an ext3 file system is specified in the /etc/fstab entry for that file system. Ordered mode is the default, but you can select a different journaling mode by changing the options for the desired partition in /etc/fstab. For example, an entry in /etc/fstab that selects the writeback journaling mode would look like this:

/dev/hda5 /opt ext3 data=writeback 1 0

Windows NT Family File System (NTFS)

Physical structure of NTFS

Let's start with general facts. An NTFS partition, in theory, can be almost any size. Of course, there is a limit, but I won’t even indicate it, since it will be sufficient for the next hundred years of development of computer technology - at any growth rate. How does this work in practice? Almost the same. The maximum size of an NTFS partition is currently limited only by the size of the hard drives. NT4, however, will experience problems when trying to install on a partition if any part of it is more than 8 GB from the physical beginning of the disk, but this problem only affects the boot partition.

Lyrical digression. The way NT 4.0 installs on an empty disk is rather peculiar and can give the wrong impression of what NTFS can do. If you tell the installer that you want to format the drive as NTFS, the maximum size it will offer you is only 4 GB. Why so small if the size of an NTFS partition is practically unlimited? The fact is that the setup stage simply does not know this file system :) The installation program formats the disk as ordinary FAT, whose maximum size in NT is 4 GB (using a not quite standard, huge 64 KB cluster), and NT is installed on this FAT. But during the very first boot of the operating system itself (still in the installation phase), the partition is quickly converted to NTFS, so the user notices nothing except the strange "limit" on NTFS size during installation. :)

Partition structure - general view

Like any other system, NTFS divides all useful space into clusters - blocks of data that are read or written as a unit. NTFS supports almost any cluster size, from 512 bytes to 64 KB, with a 4 KB cluster regarded as a kind of standard. NTFS has no anomalies in its cluster structure, so there is not much more to say on this rather banal topic.

An NTFS disk is conventionally divided into two parts. The first 12% of the disk is allocated to the so-called MFT zone - the space into which the MFT metafile grows (more on this below). It is not possible to write any data to this area. The MFT zone is always kept empty - this is done so that the most important service file (MFT) does not become fragmented as it grows. The remaining 88% of the disk is normal file storage space.

Free disk space, however, includes all physically free space - the unfilled parts of the MFT zone are counted as well. The MFT zone is used as follows: when files can no longer be written to the regular space, the MFT zone is simply reduced (in current versions of the operating systems, by exactly half), thus freeing up space for writing files. When space is freed up in the regular file area, the MFT zone may expand again. It is possible that ordinary files remain in this zone: there is no anomaly here. The system tried to keep it free, but it did not work out. Life goes on... The MFT metafile may still become fragmented, although this is undesirable.
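The arithmetic of the MFT zone is easy to sketch. The code below is only an illustration of the 12% reservation and the halving described above; the function names are invented, and the real driver's bookkeeping is certainly more involved.

```python
def mft_zone_clusters(total_clusters: int, halvings: int = 0, percent: int = 12) -> int:
    """Clusters reserved for the MFT zone after it has been halved `halvings` times."""
    reserved = total_clusters * percent // 100  # integer math avoids rounding surprises
    return reserved >> halvings                 # each halving cuts the zone in two


def regular_clusters(total_clusters: int, halvings: int = 0) -> int:
    """Clusters available for ordinary files outside the MFT zone."""
    return total_clusters - mft_zone_clusters(total_clusters, halvings)


total = 1_000_000
print(mft_zone_clusters(total), regular_clusters(total))        # freshly formatted volume
print(mft_zone_clusters(total, 1), regular_clusters(total, 1))  # after one halving
```

On a volume of one million clusters this gives 120,000 reserved clusters at first and 60,000 after the first halving, with the regular area growing from 880,000 to 940,000 clusters.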

MFT and its structure

The NTFS file system is an outstanding achievement of structuring: every element of the system is a file - even service information. The most important file on NTFS is called the MFT, or Master File Table - a general table of files. It is located in the MFT zone and is a centralized directory of all other files on the disk and, paradoxically, of itself. The MFT is divided into fixed-size records (usually 1 KB), and each record corresponds to a file (in the general sense of the word). The first 16 files are of a service nature and are inaccessible to the operating system - they are called metafiles, with the very first metafile being the MFT itself. These first 16 MFT records are the only part of the disk that has a fixed position. Interestingly, a second copy of the first three records is stored, for reliability (they are very important), exactly in the middle of the disk. The rest of the MFT file can be located, like any other file, in arbitrary places on the disk - its position can be recovered from the file itself, starting from its very foundation, the first MFT record.
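Because MFT records have a fixed size, finding a record by its number is simple multiplication. The sketch below assumes the typical 1 KB record size mentioned above; the function names are invented for the example, and the offsets are positions inside the $MFT file's own data, not positions on the disk (the MFT itself may be fragmented).

```python
RECORD_SIZE = 1024     # typical MFT record size (1 KB)
METAFILE_RECORDS = 16  # the first 16 records describe metafiles


def record_offset(index: int, record_size: int = RECORD_SIZE) -> int:
    """Byte offset of record `index` within the $MFT file's data."""
    if index < 0:
        raise ValueError("record index must be non-negative")
    return index * record_size


def is_metafile(index: int) -> bool:
    """True for the service records at the start of the MFT."""
    return 0 <= index < METAFILE_RECORDS


print(record_offset(0), record_offset(16), is_metafile(15), is_metafile(16))
```

The same multiplication gives a lower bound on the size of $MFT: a volume with N files needs at least N records of 1 KB each.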

Metafiles

The first 16 NTFS files (metafiles) are of a service nature. Each of them is responsible for some aspect of the system's operation. The advantage of such a modular approach is its amazing flexibility - for example, on FAT, physical damage in the FAT area itself is fatal to the functioning of the entire disk, and NTFS can shift, even fragment across the disk, all of its service areas, bypassing any surface faults - except for the first 16 MFT elements.

Metafiles are located in the root directory of an NTFS disk, and their names begin with the character "$", although it is difficult to obtain any information about them by standard means. Curiously, these files also report a perfectly real size - you can find out, for example, how much space the operating system spends on cataloging your entire disk by looking at the size of the $MFT file. The following table shows the metafiles currently in use and their purpose.


	$MFT	the MFT itself
	$MFTMirr	a copy of the first 16 MFT records placed in the middle of the disk
	$LogFile	logging support file (see below)
	$Volume	service information - volume label, file system version, etc.
	$AttrDef	list of standard file attributes on the volume
	$.	root directory
	$Bitmap	volume free space map
	$Boot	boot sector (if the partition is bootable)
	$Quota	a file that records user rights to use disk space (in use only since NT5)
	$Upcase	a table of correspondence between uppercase and lowercase letters in file names on the current volume. It is needed mainly because NTFS file names are written in Unicode, which has some 65 thousand different characters, and finding the uppercase and lowercase equivalents among them is far from trivial.

Files and streams

So, the system has files - and nothing but files. What does this concept include on NTFS?

First of all, a mandatory element is a record in the MFT, because, as mentioned earlier, every file on the disk is listed in the MFT. All information about the file is stored here, with the exception of the data itself: the file name, size, location on disk of its individual fragments, and so on. If one MFT record is not enough for this information, several are used, and not necessarily consecutive ones.

Optional element - file data streams. The definition of “optional” may seem strange, but, nevertheless, there is nothing strange here. Firstly, the file may not have data - in this case, it does not consume the free space of the disk itself. Secondly, the file may not be very large. Then a rather successful solution comes into play: the file data is stored directly in the MFT, in the space remaining from the main data within one MFT record. Files that occupy hundreds of bytes usually do not have their “physical” embodiment in the main file area - all the data of such a file is stored in one place - in the MFT.

The situation with the file data is quite interesting. Each file on NTFS has, in general, a somewhat abstract structure - it does not have data as such, but it has streams. One of the streams has the meaning we are familiar with - the file data. But most file attributes are also streams! Thus it turns out that a file has only one basic entity - its number in the MFT - and everything else is optional. This abstraction can be used to create quite convenient things: for example, you can "attach" another stream to a file and write any data into it, such as information about the author and contents of the file, as is done in Windows 2000 (the rightmost tab in the file properties, viewed from Explorer). Interestingly, these additional streams are not visible by standard means: the observed file size is only the size of the main stream, which contains the traditional data. You can, for example, have a file of zero length which, when erased, frees up 1 GB of space - simply because some cunning program or technology has attached an additional gigabyte-sized stream (alternate data) to it. In practice, however, streams are hardly used at the moment, so there is no need to fear such situations, although hypothetically they are possible. Just keep in mind that a file on NTFS is a deeper and more global concept than one might imagine from simply browsing the disk's directories. And finally: a file name can contain any characters, including the full set of national alphabets, since names are stored in Unicode - a 16-bit representation that gives 65536 different characters. The maximum file name length is 255 characters.

Directories

An NTFS directory is a special file that stores links to other files and directories, creating a hierarchical structure of data on the disk. The directory file is divided into blocks, each of which contains a file name, basic attributes and a link to the MFT record, which in turn provides complete information about the directory entry. The internal directory structure is a binary tree. Here is what this means: to find a file with a given name in a linear directory, such as a FAT directory, the operating system has to look through all the entries until it finds the right one. A binary tree arranges file names in such a way that the search goes faster - by obtaining yes/no answers to questions about the location of the file. The question that a binary tree can answer is: in which group, relative to a given element, is the name you are looking for - above or below? We start by asking this question of the middle element, and each answer narrows the search area by half on average. The files are, say, simply sorted alphabetically, and the question is answered in the obvious way - by comparing the initial letters. The search area, narrowed by half, is then explored in the same way, starting again from its middle element.

Conclusion - to find one file among 1000, for example, FAT will have to make an average of 500 comparisons (on average the file is found halfway through the list), while a tree-based system will have to make only about 10 (2^10 = 1024). The savings in search time are obvious. However, you should not think that in traditional systems (FAT) things are that bad: firstly, maintaining the list of files as a binary tree is quite labor-intensive, and secondly, even FAT as implemented by a modern system (Windows 2000 or Windows 98) uses a similar search optimization. This is just another fact to add to your knowledge base. I would also like to dispel the common misconception (which I myself shared until quite recently) that adding a file to a tree-structured directory is harder than adding it to a linear one: the two operations take quite comparable time. The fact is that in order to add a file to a directory, you first need to make sure that a file with that name is not there yet :) - and here a linear system runs into the search difficulties described above, which more than offset the simplicity of the addition itself.
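The figures of 500 and 10 comparisons can be checked with a short experiment. This is a simplified sketch: a sorted Python list searched by halving stands in for the directory tree, and the file names are invented for the test.

```python
def linear_comparisons(names, target):
    """Count comparisons needed to find target by scanning names in order."""
    for count, name in enumerate(names, start=1):
        if name == target:
            return count
    raise KeyError(target)


def binary_comparisons(sorted_names, target):
    """Count probes needed to find target in a sorted list by repeated halving."""
    low, high = 0, len(sorted_names) - 1
    count = 0
    while low <= high:
        mid = (low + high) // 2
        count += 1
        if sorted_names[mid] == target:
            return count
        if sorted_names[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    raise KeyError(target)


# 1000 zero-padded names, so alphabetical order equals numeric order.
names = [f"file{i:04d}.txt" for i in range(1000)]
avg_linear = sum(linear_comparisons(names, n) for n in names) / len(names)
max_binary = max(binary_comparisons(names, n) for n in names)
print(avg_linear, max_binary)  # 500.5 10
```

The linear scan averages 500.5 comparisons over all 1000 names, while the halving search never needs more than 10 probes.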

What information can be obtained by simply reading a directory file? Exactly what the dir command produces. For simple disk navigation there is no need to go to the MFT for each file; it is enough to read the most general information about files from the directory files. The main directory of the disk - the root - is no different from ordinary directories, except for a special link to it from the beginning of the MFT metafile.

Journaling

NTFS is a fault-tolerant system that can easily restore itself to a correct state in the event of almost any real failure. Any modern file system is based on the concept of a transaction - an action performed entirely and correctly or not performed at all. NTFS simply does not have intermediate (erroneous or incorrect) states - the quantum of data change cannot be divided into before and after the failure, bringing destruction and confusion - it is either committed or canceled.

Example 1: data is being written to disk. Suddenly it turns out that it was not possible to write to the place where we had just decided to write the next portion of data - physical damage to the surface. The behavior of NTFS in this case is quite logical: the write transaction is rolled back entirely - the system realizes that the write was not performed. The location is marked as failed, and the data is written to another location - a new transaction begins.

Example 2: a more complex case - data is being written to disk. Suddenly, bang - the power is turned off and the system reboots. At what phase did the recording stop, where is the data, and where is nonsense? Another system mechanism comes to the rescue - the transaction log. The fact is that the system, realizing its desire to write to disk, marked this state in the $LogFile metafile. When rebooting, this file is examined for the presence of unfinished transactions that were interrupted by an accident and the result of which is unpredictable - all these transactions are canceled: the place where the write was made is marked again as free, indexes and MFT elements are returned to the state in which they were before failure, and the system as a whole remains stable. Well, what if an error occurred while writing to the log? It’s also okay: the transaction either hasn’t started yet (there is only an attempt to record the intentions to carry it out), or it has already ended - that is, there is an attempt to record that the transaction has actually already been completed. In the latter case, at the next boot, the system itself will fully understand that in fact everything was written correctly anyway, and will not pay attention to the “unfinished” transaction.
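The recovery step from Example 2 can be modeled in a few lines. This is a toy sketch and not the real $LogFile format: the log is a plain list of tuples, the "disk" is a dictionary, and the record layout (kind, transaction id, location, previous value) is an assumption made for illustration.

```python
def recover(log, disk):
    """Undo every write that belongs to a transaction without a 'commit' record."""
    committed = {txid for kind, txid, *_ in log if kind == "commit"}
    # Walk the log backwards so the oldest saved value is restored last.
    for kind, txid, *rest in reversed(log):
        if kind == "write" and txid not in committed:
            location, old_value = rest
            if old_value is None:
                disk.pop(location, None)   # the location was free before
            else:
                disk[location] = old_value
    return disk


log = [
    ("begin", 1), ("write", 1, "cluster 400", None), ("commit", 1),
    ("begin", 2), ("write", 2, "cluster 401", None),  # power lost here
]
disk = {"cluster 400": "data A", "cluster 401": "half-written"}
recover(log, disk)
print(disk)
```

After recovery only the committed write to cluster 400 remains; the half-finished write of transaction 2 is discarded and its cluster is free again.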

Still, remember that logging is not an absolute panacea, but only a means to significantly reduce the number of errors and system failures. It is unlikely that the average NTFS user will ever notice a system error or be forced to run chkdsk - experience shows that NTFS is restored to a completely correct state even in case of failures at moments very busy with disk activity. You can even optimize the disk and press reset in the middle of this process - the likelihood of data loss even in this case will be very low. It is important to understand, however, that the NTFS recovery system guarantees the correctness of the file system, not your data. If you were writing to a disk and got a crash, your data may not be written. There are no miracles.

NTFS files have one quite useful attribute - "compressed". The fact is that NTFS has built-in support for disk compression - something for which you previously had to use Stacker or DoubleSpace. Any file or directory can be individually stored on disk in compressed form - this process is completely transparent to applications. File compression has a very high speed and only one big negative property - the huge virtual fragmentation of compressed files, which, however, does not really bother anyone. Compression is carried out in blocks of 16 clusters and uses so-called “virtual clusters” - again an extremely flexible solution that allows you to achieve interesting effects - for example, half of the file can be compressed, and half cannot. This is achieved due to the fact that storing information about the compression of certain fragments is very similar to regular file fragmentation: for example, a typical record of the physical layout for a real, uncompressed file:

file clusters from 1 to 43 are stored in disk clusters starting from 400
file clusters from 44 to 52 are stored in disk clusters starting from 8530
...

Physical layout of a typical compressed file:

file clusters from 1 to 9 are stored in disk clusters starting from 400
file clusters from 10 to 16 are not stored anywhere
file clusters from 17 to 18 are stored in disk clusters starting from 409
file clusters from 19 to 36 are not stored anywhere
...

It can be seen that the compressed file has “virtual” clusters, in which there is no real information. As soon as the system sees such virtual clusters, it immediately understands that the data from the previous block, a multiple of 16, must be decompressed, and the resulting data will just fill the virtual clusters - that, in fact, is the whole algorithm.
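The rule for spotting compressed blocks can be expressed directly over such a layout. The sketch below is illustrative only: the tuple format and function names are invented, and the last run of the compressed example is cut off at cluster 32 so that it covers exactly two 16-cluster blocks.

```python
UNIT = 16  # compression works on blocks of 16 clusters

# (first file cluster, last file cluster, first disk cluster or None if "virtual")
plain_runs = [(1, 43, 400), (44, 52, 8530)]
compressed_runs = [(1, 9, 400), (10, 16, None), (17, 18, 409), (19, 32, None)]


def unit_summary(runs):
    """Map each 16-cluster block (numbered from 0) to [stored, virtual] counts."""
    summary = {}
    for first, last, disk_start in runs:
        for cluster in range(first, last + 1):
            counts = summary.setdefault((cluster - 1) // UNIT, [0, 0])
            counts[0 if disk_start is not None else 1] += 1
    return summary


def compressed_units(runs):
    """Blocks that contain virtual clusters and so must be decompressed on read."""
    return [unit for unit, (stored, virtual) in sorted(unit_summary(runs).items())
            if virtual > 0]


print(unit_summary(compressed_runs))
print(compressed_units(compressed_runs), compressed_units(plain_runs))
```

In the compressed example the first block keeps 9 real clusters out of 16 and the second only 2, while the uncompressed file has no virtual clusters at all.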

Security

NTFS contains many means of delimiting access rights to objects - it is believed to be the most advanced file system in this respect of all that currently exist. In theory this is undoubtedly true, but in current implementations, unfortunately, the system of rights is quite far from ideal: it is a rigid, but not always logical, set of characteristics. The rights that can be assigned to an object and that the system strictly enforces keep evolving - major changes and additions have already been made several times, and by Windows 2000 they have finally arrived at a fairly reasonable set.

NTFS rights are inextricably tied to the operating system itself - that is, generally speaking, another system is not obliged to respect them if it is given physical access to the disk. To guard against physical access, Windows 2000 (NT5) did finally introduce a standard feature - see below for more on this. The system of rights in its current state is quite complex, and I doubt that I can tell the general reader anything interesting and useful for everyday life. If you are interested in this topic, you will find many books on NT network architecture that describe it in more detail.

At this point the description of the structure of the file system can be completed; it remains to describe only a certain number of simply practical or original things.

Hard links have been in NTFS since time immemorial but were used very rarely. A hard link is when the same file has two names (several directory entries, in the same or in different directories, point to the same MFT record). Let's say the same file has the names 1.txt and 2.txt: if the user deletes 1.txt, 2.txt will remain. If he deletes 2.txt, 1.txt will remain; that is, both names are completely equal from the moment of creation. The file is physically erased only when its last name is deleted.
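The behavior of two equal names is easy to try out. The sketch below uses Python's standard os.link call in a temporary directory; it assumes the file system holding the temporary directory supports hard links, and the file names are the ones from the example above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    first = os.path.join(folder, "1.txt")
    second = os.path.join(folder, "2.txt")
    with open(first, "w") as f:
        f.write("shared contents")

    os.link(first, second)                    # give the same file a second name
    links_before = os.stat(first).st_nlink    # number of names the file has now

    os.remove(first)                          # delete one name...
    with open(second) as f:
        survivor = f.read()                   # ...the data is still reachable
    links_after = os.stat(second).st_nlink

print(links_before, links_after, survivor)
```

The link count drops from 2 to 1 when the first name is removed, and the contents stay readable through the remaining name.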

Symbolic Links (NT5)

A much more practical feature that lets you create virtual directories - just like the virtual disks created with the subst command in DOS. The applications are quite varied; the first is simplifying the directory structure. If you don't like the directory Documents and settings\Administrator\Documents, you can link it into the root directory - the system will go on using the directory with the unwieldy path, while you get a much shorter name that is completely equivalent to it. To create such links, you can use the junction program written by the well-known specialist Mark Russinovich (http://www.sysinternals.com). The program only works in NT5 (Windows 2000), as does the feature itself. To remove a link, you can use the standard rd command. WARNING: Attempting to delete a link using Explorer or other file managers that do not understand the virtual nature of the directory (such as FAR) will delete the data referenced by the link! Be careful.

Encryption (NT5)

A useful feature for people who are concerned about their secrets - each file or directory can also be encrypted, making it impossible for another NT installation to read it. Combined with a standard and virtually unbreakable password for booting the system itself, this feature provides sufficient security for most applications for the important data you select.