Hadoop-Let us Admin

Metadata

Because the amount of metadata per file is relatively small (the NameNode tracks only filenames, permissions, and the locations of each block), the NameNode stores all of the metadata in main memory, which allows fast random access. The metadata storage is designed to be compact. As a result, a NameNode with 4 GB of RAM can support a very large number of files and directories.
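To put a rough number on "compact", the sketch below estimates NameNode memory from object counts. The figure of about 150 bytes per file, directory, or block is a commonly quoted rule of thumb and is an assumption here, not a value from the text above; the function names are hypothetical.

```python
# Rough NameNode memory estimate.
# ASSUMPTION: ~150 bytes per namespace object (file, directory, or block).
# This is a commonly quoted rule of thumb, not a measured value.
BYTES_PER_OBJECT = 150


def estimate_namenode_bytes(num_files, num_dirs, blocks_per_file=1):
    """Approximate memory needed to hold the namespace objects."""
    num_blocks = num_files * blocks_per_file
    return (num_files + num_dirs + num_blocks) * BYTES_PER_OBJECT


def max_files_for_memory(memory_bytes, blocks_per_file=1):
    """Invert the estimate, ignoring directories for simplicity."""
    per_file = (1 + blocks_per_file) * BYTES_PER_OBJECT
    return memory_bytes // per_file


if __name__ == "__main__":
    four_gb = 4 * 1024 ** 3
    print(max_files_for_memory(four_gb))
```

Under these assumptions, 4 GB works out to roughly 14 million single-block files. A real NameNode cannot devote its whole heap to namespace objects, so treat the result as an upper bound on the order of magnitude.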

Modern distributed and parallel file systems such as pNFS, PVFS, HDFS, and GoogleFS treat metadata services as an independent system component, separate from the data servers. One reason for this separation is to ensure that metadata access does not obstruct the data access path. Another is design simplicity and the ability to scale the two parts of the system independently.

Files and directories are represented on the NameNode by inodes, which record attributes such as permissions, modification and access times, and namespace and disk space quotas. The NameNode maintains the file system namespace. Any change to the file system namespace or its properties is recorded by the NameNode. HDFS keeps the entire namespace in RAM.
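The inode description above can be pictured as a small in-memory tree. The following is a minimal sketch, not the actual HDFS classes: the names INode, Namespace, and change_log are hypothetical, and the field layout is an assumption chosen to mirror the attributes listed in the paragraph.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class INode:
    """Hypothetical inode holding the attributes named in the text."""
    name: str
    permission: int                   # e.g. 0o755
    modification_time: int
    access_time: int
    is_dir: bool = False
    ns_quota: Optional[int] = None    # namespace quota (max number of names)
    ds_quota: Optional[int] = None    # disk space quota in bytes
    block_ids: List[int] = field(default_factory=list)
    children: Dict[str, "INode"] = field(default_factory=dict)


class Namespace:
    """In-memory inode tree plus a list recording every change."""

    def __init__(self) -> None:
        self.root = INode(name="", permission=0o755,
                          modification_time=0, access_time=0, is_dir=True)
        self.change_log: List[tuple] = []

    def _parent_of(self, path: str) -> INode:
        node = self.root
        for part in path.strip("/").split("/")[:-1]:
            node = node.children[part]   # KeyError if a parent is missing
        return node

    def add(self, path: str, permission: int, now: int,
            is_dir: bool = False) -> INode:
        parent = self._parent_of(path)
        if not parent.is_dir:
            raise NotADirectoryError(path)
        name = path.rstrip("/").split("/")[-1]
        node = INode(name, permission, now, now, is_dir)
        parent.children[name] = node
        parent.modification_time = now
        # Record the change, as the NameNode does for namespace updates.
        self.change_log.append(("add", path, permission, now, is_dir))
        return node
```

For example, calling add("/data", 0o755, 1, is_dir=True) and then add("/data/a.txt", 0o644, 2) leaves two entries in change_log and a two-level tree under the root.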

Metadata is the most important management information replicated for NameNode failover. In our solution, the metadata comprises initial metadata, which is replicated in the initialization phase, and two types of runtime metadata, which are replicated in the replication phase. The initial metadata consists of two files: the version file, which contains the version information of the running HDFS, and the file system image (fsimage) file, which is a persistent checkpoint of the file system. Both files are replicated only once, in the initialization phase, because replicating them is time-intensive. The slave node then updates its fsimage file from the runtime metadata so that it catches up with that of the primary node.
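The two phases can be sketched in a few lines. This is an illustration under stated assumptions, not the replication protocol itself: the fsimage is modeled as a plain dictionary, the runtime metadata as hypothetical ("put" or "delete", path, attributes) records, and the names Slave and apply_record are invented for the example.

```python
import copy


def apply_record(fsimage, record):
    """Apply one hypothetical runtime-metadata record to an fsimage dict."""
    op, path, attrs = record
    if op == "put":
        fsimage[path] = attrs
    elif op == "delete":
        fsimage.pop(path, None)
    else:
        raise ValueError(f"unknown op: {op}")


class Slave:
    def __init__(self):
        self.version = None
        self.fsimage = None

    def initialize(self, version, fsimage):
        # Initialization phase: the version file and fsimage are copied once.
        self.version = version
        self.fsimage = copy.deepcopy(fsimage)

    def replicate(self, records):
        # Replication phase: replay runtime metadata so the local fsimage
        # catches up with the primary.
        if self.fsimage is None:
            raise RuntimeError("initialize() must run before replicate()")
        for record in records:
            apply_record(self.fsimage, record)
```

The expensive full copy happens only in initialize(); afterwards the slave stays current by replaying small records, which is the reason the text gives for replicating the two initial files only once.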

The NameNode keeps an in-memory image of the entire file system namespace, including the mapping of files to blocks. It persists this metadata on disk in a file called FsImage.
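The relationship between the in-memory namespace and the on-disk checkpoint can be sketched as a save and load round trip. JSON is used here only as a stand-in so the example stays readable; it is an assumption and not the real FsImage format, and save_checkpoint and load_checkpoint are hypothetical names.

```python
import json
import os
import tempfile


def save_checkpoint(namespace, path):
    """Write the namespace to disk, then rename it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(namespace, f, sort_keys=True)
    # A reader sees either the old checkpoint or the new one, never a
    # half-written file.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Rebuild the in-memory namespace from the checkpoint file."""
    with open(path) as f:
        return json.load(f)
```

Writing to a temporary file and renaming it is a design choice for the sketch: a crash in the middle of a save leaves the previous checkpoint intact.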