Talk:FAQ new config: Difference between revisions

(mksquashfs example)
(+ Some mksquashfs hints, no /usr/share/doc/squashfs-tools/README on centos :-()
Revision as of 14:08, 11 February 2019

Draft squashfs/mountimg

How can I use many small files efficiently?

You can improve performance and reduce the load on /data in the following cases:

  • case1: your jobs only read from the directories where your zotfiles reside
  • case2: your jobs read your zotfiles and also add new files to them
  • case3: your jobs generate zotfiles that will afterwards only be read or extended with new files

For case1:

  • convert your zotfiles directories to squashfs images
  • in your jobs:
    • mount those images using sudo mountimg
    • use those mounted directories for processing

For case2:

  • convert your zotfiles directories to squashfs images
  • in your jobs:
    • mount those images using sudo mountimg
    • use those mounted directories for processing but generate new files on the local filesystems of the node (ex: /tmp)
    • unmount the images with sudo mountimg -u
    • add the new files to the images with mksquashfs-no-compression
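The case2 steps can be sketched as a small shell function. This is only an illustration under stated assumptions: run_case2_job and process_files are hypothetical names, and passing the mountpoint to sudo mountimg -u is an assumption to check against sudo mountimg --help.

```shell
# Sketch of a case2 job. run_case2_job and process_files are hypothetical.
# Assumption: "sudo mountimg -u" takes the mountpoint as its argument.
run_case2_job() (
  img=$1        # existing squashfs image under /data
  mnt=$2        # existing, empty mountpoint directory
  new=$(mktemp -d /tmp/newfiles.XXXXXX) || exit 1

  sudo mountimg "$img" "$mnt" || exit 1

  # Read from the mounted image, write new files to the node-local directory.
  process_files "$mnt" "$new"
  status=$?

  # Always unmount, even if the processing step failed.
  sudo mountimg -u "$mnt" || exit 1
  [ "$status" -eq 0 ] || exit "$status"

  # The image already exists, so mksquashfs appends the new files to it.
  mksquashfs-no-compression "$new" "$img" || exit 1
  rm -rf "$new"
)
```

A job would define its own process_files function (taking the mounted directory and the output directory) and then call run_case2_job with the image path and the mountpoint.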

For case3:

  • in your jobs:
    • generate your zotfiles on the local filesystems of the node (ex: /tmp)
    • convert them to squashfs images under /data with mksquashfs-no-compression
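For case3, a job might look like the following sketch. The names run_case3_job and generate_files are hypothetical, and the image path is whatever location under /data suits your project.

```shell
# Sketch of a case3 job. run_case3_job and generate_files are hypothetical.
run_case3_job() (
  img=$1        # image to create under /data
  out=$(mktemp -d /tmp/zotfiles.XXXXXX) || exit 1

  # Write the many small files to the node-local filesystem first.
  generate_files "$out" || exit 1

  # Then store them under /data as a single image. If $img already
  # exists, mksquashfs appends to it instead of replacing it.
  mksquashfs-no-compression "$out" "$img" || exit 1
  rm -rf "$out"
)
```

Only the final image is written under /data, so the shared filesystem sees one large file rather than many small ones.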

Creating squashfs images

To convert your zotfiles to images, first choose the granularity appropriate to your case.

sudo mountimg currently allows at most 4000 images to be mounted on a node.
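When choosing the granularity, it can help to check that the number of images stays within that limit. A minimal sketch, assuming all images sit directly in one directory with a .squashfs suffix; count_images, check_image_count and MAX_IMAGES are hypothetical names, not part of mountimg:

```shell
# Hypothetical helpers: check that a directory does not hold more images
# than a node can mount (4000 according to this page).
MAX_IMAGES=4000

count_images() {
  # Count the *.squashfs files directly inside directory $1.
  find "$1" -maxdepth 1 -type f -name '*.squashfs' | wc -l | tr -d ' '
}

check_image_count() {
  n=$(count_images "$1")
  if [ "$n" -gt "$MAX_IMAGES" ]; then
    echo "too many images in $1: $n > $MAX_IMAGES" >&2
    return 1
  fi
  echo "$n images in $1, within the limit of $MAX_IMAGES"
}
```

If the check fails, a coarser granularity (fewer, larger images) is probably needed.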

If, for example, you have a really big directory /data/.../DDD/DD/ containing hundreds of sub-directories D1 D2 ... DN, you may prefer to make one image per sub-directory.

Example (in bash):

 cd /data/.../DDD || exit
 # Create separate directories for the images and the mountpoints
 mkdir DD-img DD-mnt
 cd DD || exit
 for i in D*; do
   # Create the image
   mksquashfs-no-compression "$i" "../DD-img/$i.squashfs"
   # Create the mountpoint for your jobs
   mkdir "../DD-mnt/$i"
 done

then in your jobs, if you need to mount all those images:

 cd /data/.../DDD/DD-mnt || exit
 for i in *; do
   sudo mountimg "../DD-img/$i.squashfs" "$i" || exit
 done

Some mksquashfs hints:

  • If the destination image already exists, the source files/directories will be added (appended) to the image.
    • Use the -noappend option if you want to re-create the image completely, or remove the image first.
  • If a single directory is specified (i.e. mksquashfs source output_fs) the squashfs filesystem will consist of that directory, with the top-level root directory corresponding to the source directory.
    • Use the -keep-as-directory option if you want the source directory to appear as an entry in the root directory instead.
  • If multiple source directories or files are specified, mksquashfs will merge the specified sources into a single filesystem, with the root directory containing each of the source files/directories. The name of each directory entry will be the basename of the source path. If more than one source entry maps to the same name, the conflicts are named xxx_1, xxx_2, etc. where xxx is the original name.
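The two options mentioned in these hints can be illustrated with two small wrappers. This is only a sketch: the function names are hypothetical, while -noappend and -keep-as-directory are the mksquashfs options named above.

```shell
# Rebuild an image from scratch instead of appending to an existing one.
rebuild_image() {
  mksquashfs "$1" "$2" -noappend
}

# Keep the source directory itself as an entry in the root of the image,
# rather than making its contents the root of the image.
pack_keeping_directory() {
  mksquashfs "$1" "$2" -keep-as-directory
}
```

For example, rebuild_image D1 ../DD-img/D1.squashfs would replace an existing D1.squashfs rather than add to it.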

mksquashfs-no-compression is a simple wrapper around mksquashfs that disables all compression to focus on speed. Feel free to try mksquashfs directly with other options, such as -comp lzo, to save disk space.
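The wrapper installed on the cluster is not reproduced on this page. As a rough idea of what such a wrapper could do, here is a sketch that passes the standard mksquashfs options for disabling compression of inodes, data, fragments and extended attributes; the function name is hypothetical and the real wrapper may differ.

```shell
# Hypothetical stand-in for mksquashfs-no-compression: forward all
# arguments to mksquashfs and turn off every kind of compression.
mksquashfs_uncompressed() {
  mksquashfs "$@" -noI -noD -noF -noX
}
```

It is called like mksquashfs itself, with the sources first and the destination image last, for example mksquashfs_uncompressed D1 ../DD-img/D1.squashfs.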

Refs:

  • man mksquashfs
  • sudo mountimg --help