Discussion:FAQ new config: Difference between revisions

From ClustersSophia
Draft squashfs/mountimg
 
  
== How can I use many small files efficiently? ==
 
 
You can improve performance and reduce the load on '''/data''' in the following cases:
 
* '''case1''': your jobs only read from the directories where your zotfiles reside
* '''case2''': your jobs read your zotfiles and only add new files or directories to them
* '''case3''': your jobs generate zotfiles, which are afterwards accessed only for reading or for adding new files
 
 
For '''case1''':
 
* convert your zotfiles directories to squashfs images
* in your jobs:
** mount those images using '''sudo mountimg'''
** use those mounted directories for processing
 
 
For '''case2''':
 
* convert your zotfiles directories to squashfs images
* in your jobs:
** mount those images using '''sudo mountimg'''
** use those mounted directories for processing, but generate new files on the local filesystems of the node (e.g. /tmp)
** unmount the images with '''sudo mountimg -u'''
** add the new files to the images with '''mksquashfs-no-compression'''
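
The case2 steps above can be sketched as a job script. This is only a minimal sketch under stated assumptions: the function names '''process_case2''' and '''process_data''', the scratch directory name and the arguments are hypothetical placeholders; only '''sudo mountimg''', '''sudo mountimg -u''' and '''mksquashfs-no-compression''' come from this page.

```shell
#!/bin/bash
# Sketch of a case2 job: read from a mounted image, write new files on the
# node-local disk, then append them to the image. All paths are placeholders.

# Hypothetical stand-in for your own processing: reads "$1", writes into "$2".
process_data() {
  local input_dir=$1 output_dir=$2
  ls "$input_dir" > "$output_dir/listing.txt"
}

process_case2() {
  local image=$1 mountpoint=$2
  local scratch
  # New files go to the node-local filesystem, not to /data
  scratch=$(mktemp -d /tmp/zotfiles.XXXXXX) || return 1

  sudo mountimg "$image" "$mountpoint" || return 1
  process_data "$mountpoint" "$scratch"
  local status=$?
  # Unmount before modifying the image
  sudo mountimg -u "$mountpoint" || return 1

  # The destination already exists, so the contents of the scratch
  # directory are appended at the top level of the image
  if [ "$status" -eq 0 ]; then
    mksquashfs-no-compression "$scratch" "$image" || return 1
  fi
  rm -rf "$scratch"
  return "$status"
}
```

A job would then call, for instance, '''process_case2 <image path> <directory>''' with the same arguments it would pass to '''sudo mountimg'''.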
 
 
For '''case3''':
 
* in your jobs:
** generate your zotfiles on the local filesystems of the node (e.g. /tmp)
** convert them to squashfs images under '''/data''' with '''mksquashfs-no-compression'''
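
The case3 steps above might look like the following in a job script. It is a sketch, not a tested recipe: '''generate_data''' and '''produce_image''' are hypothetical names, the generated files are dummies, and the destination path is left to the caller.

```shell
#!/bin/bash
# Sketch of a case3 job: generate files on the node-local disk, then store
# them under /data as a single squashfs image. Paths are placeholders.

# Hypothetical stand-in for your own generation step: writes into "$1".
generate_data() {
  local output_dir=$1
  local n
  for n in 1 2 3; do
    echo "result $n" > "$output_dir/file$n.txt"
  done
}

produce_image() {
  local image=$1   # destination image, expected to live under /data
  local scratch
  scratch=$(mktemp -d /tmp/zotfiles.XXXXXX) || return 1

  generate_data "$scratch" || { rm -rf "$scratch"; return 1; }

  # A single image file is written to /data instead of many small files
  mksquashfs-no-compression "$scratch" "$image"
  local status=$?
  rm -rf "$scratch"
  return "$status"
}
```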
 
 
=== Creating squashfs images ===
 
 
You can convert your zotfiles on '''nef-devel''' or '''nef-devel2'''.
 
 
To convert your zotfiles to images, first choose the granularity appropriate to your case.
 
 
'''sudo mountimg''' currently allows mounting at most 4000 images on a node.
 
 
If, for example, you have a very large directory /data/.../DDD/DD/ containing hundreds of sub-directories D1 D2 ... DN, you may prefer to create one image per sub-directory.
 
 
Example (in bash):
 
 
  cd /data/.../DDD || exit
  # Build a separate directory for the images and the mountpoints
  mkdir DD-img DD-mnt
  cd DD || exit
  for i in D*; do
    # Create the image (quoted so that names containing spaces work)
    mksquashfs-no-compression "$i" "../DD-img/$i.squashfs"
    # Create the mountpoint for your future jobs
    mkdir "../DD-mnt/$i"
  done
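
Before creating one image per sub-directory, it can help to check that the number of sub-directories stays within the 4000-image limit mentioned above. The helper below is a hypothetical sketch ('''count_subdirs''', '''check_granularity''' and '''MOUNTIMG_LIMIT''' are names invented here); it assumes a '''find''' that supports '''-mindepth''' and '''-maxdepth'''.

```shell
# Sketch: check that one image per sub-directory stays within the
# mountimg limit quoted on this page (4000 images per node).
MOUNTIMG_LIMIT=4000

# Prints the number of immediate sub-directories of "$1"
count_subdirs() {
  find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

check_granularity() {
  local dir=$1
  local n
  n=$(count_subdirs "$dir")
  if [ "$n" -le "$MOUNTIMG_LIMIT" ]; then
    echo "$n sub-directories: one image per sub-directory fits"
  else
    echo "$n sub-directories: too many, group several per image"
    return 1
  fi
}
```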
 
 
'''mksquashfs-no-compression''' is a simple wrapper around '''mksquashfs''' that disables all compression in order to favour speed. Feel free to try '''mksquashfs''' directly with other options such as '''-comp lzo''' to save disk space.
 
 
Some mksquashfs hints:
 
* If the destination image exists, the source files/directories are appended to the image.
** In addition, if a file/directory with the same name already exists in the image, the new one is added under the name xxx_1, xxx_2, etc., where xxx is the original name.
* If a single directory is specified (i.e. '''mksquashfs source output.squashfs'''), the squashfs filesystem consists of that directory, with the top-level root directory of the image corresponding to the source directory.
** Use the '''-keep-as-directory''' option to tell mksquashfs to keep the basename of the directory in its output.
* If multiple source directories or files are specified, mksquashfs merges them into a single filesystem whose root directory contains each of the source files/directories. The name of each directory entry is the basename of the source path. If more than one source entry maps to the same name, the conflicting entries are named xxx_1, xxx_2, etc., where xxx is the original name.
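
The single-directory hint above can be illustrated with two small wrappers. This is only a sketch: the function names '''image_contents''' and '''image_with_dirname''' are hypothetical, and '''mksquashfs''' is called directly, so its default compression applies.

```shell
# Sketch: two ways to build an image from a single directory.

# The image root holds the contents of "$1" (default behaviour)
image_contents() {
  mksquashfs "$1" "$2"
}

# The image root holds one directory named after the basename of "$1"
image_with_dirname() {
  mksquashfs "$1" "$2" -keep-as-directory
}
```

With '''image_contents D1 D1.squashfs''', the files of D1 appear directly under the mountpoint; with '''image_with_dirname D1 D1.squashfs''', they appear under a D1 sub-directory of the mountpoint.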
 
 
=== Mounting squashfs images ===
 
 
To mount one image, simply call: '''sudo mountimg <image path> <directory>'''
 
 
To unmount: '''sudo mountimg -u <directory>'''
 
 
Example: mount every squashfs image of /data/.../DDD/DD-img/ on the corresponding sub-directory under /data/.../DDD/DD-mnt/
 
 
  cd /data/.../DDD/DD-mnt || exit
  for i in *; do
    # Each mountpoint name matches the image name without its .squashfs suffix
    sudo mountimg "../DD-img/$i.squashfs" "$i" || exit
  done
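
When the images have to be unmounted explicitly, for instance before appending files to them as in case2, the reverse loop can be sketched as follows. The function name '''unmount_all''' is hypothetical, and how '''sudo mountimg -u''' behaves on a directory that is not currently mounted is not documented on this page, so the sketch simply stops at the first failure.

```shell
# Sketch: unmount every image mounted on a sub-directory of "$1".
unmount_all() {
  local mnt_root=$1
  local d
  for d in "$mnt_root"/*/; do
    # Skip the unexpanded pattern when there is no sub-directory at all
    [ -d "$d" ] || continue
    # Strip the trailing slash left by the glob before unmounting
    sudo mountimg -u "${d%/}" || return 1
  done
}
```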
 
 
In an OAR job, a mount done with mountimg is automatically unmounted when the job terminates.
 
 
Such a mount can also be shared by more than one OAR job and by more than one user. In this case, the unmount is done when all the jobs have terminated. Note that every job has to perform the mount itself in order to be registered in the list of processes needing it.
 
 
mountimg currently allows mounting at most 4000 images on a node.
 

Current revision as of 13 February 2019, 08:42