Discussion:FAQ new config: Difference between revisions

From ClustersSophia
Draft squashfs/mountimg
 
  
== How can I use many small files efficiently? ==
 
 
You can improve performance and reduce the load on /data in the following cases:
 
* '''case1''': your jobs only read the directories where your zotfiles reside
* '''case2''': your jobs read your zotfiles and also add new files to those directories
* '''case3''': your jobs generate zotfiles that will afterwards only be read or have new files added
 
 
For '''case1''':
* convert your zotfiles directories to squashfs images
* in your jobs:
** mount those images using '''sudo mountimg'''
** use those mounted directories for processing
 
 
For '''case2''':
* convert your zotfiles directories to squashfs images
* in your jobs:
** mount those images using '''sudo mountimg'''
** use those mounted directories for processing, but generate new files on the local filesystem of the node (e.g. /tmp)
** unmount the images with '''sudo mountimg -u'''
** add the new files to the images with '''mksquashfs-no-compression'''
 
 
For '''case3''':
* in your jobs:
** generate your zotfiles on the local filesystem of the node (e.g. /tmp)
** convert them to squashfs images under /data with '''mksquashfs-no-compression'''
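
The case2 steps above can be sketched as a small bash wrapper. This is only a sketch under assumptions, not the cluster's official procedure: the function name case2_job, the MOUNTIMG and MKSQ override variables and all paths are hypothetical, and it assumes that '''sudo mountimg -u''' takes the mountpoint as its argument.

```shell
# Commands used by the sketch; override them (e.g. with "echo") for a dry run.
MOUNTIMG=${MOUNTIMG:-sudo mountimg}
MKSQ=${MKSQ:-mksquashfs-no-compression}

# Usage: case2_job IMAGE MOUNTPOINT NEWDIR COMMAND [ARGS...]
# COMMAND reads under MOUNTPOINT and writes its new files under NEWDIR
# (a directory on the node-local filesystem, e.g. below /tmp).
case2_job() {
  img=$1; mnt=$2; newdir=$3
  shift 3
  # Mount the image on its mountpoint
  $MOUNTIMG "$img" "$mnt" || return 1
  # Run the processing command and remember its exit status
  status=0
  "$@" || status=$?
  # Always unmount (ASSUMPTION: -u takes the mountpoint as its argument)
  $MOUNTIMG -u "$mnt" || return 1
  # Append the new files to the image only if processing succeeded
  [ "$status" -eq 0 ] || return "$status"
  $MKSQ "$newdir" "$img"
}
```

The image is unmounted even when the processing command fails, so a failed job does not leave mounts behind on the node; the new files are appended only after a successful run.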
 
 
=== Creating squashfs images ===
 
To convert your zotfiles to images, first choose the granularity appropriate to your case.
 
 
'''sudo mountimg''' currently allows at most 4000 images to be mounted on a node.
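
A job can check that it stays under this limit before mounting anything. The following is a minimal sketch, assuming the images sit in a single directory and end in .squashfs; count_images, check_image_count and MAX_IMAGES are hypothetical names, and the check does not account for images already mounted on the node by other jobs.

```shell
# Per-node limit on mounted images, as stated for sudo mountimg
MAX_IMAGES=4000

# Print the number of *.squashfs files directly under directory $1
count_images() {
  find "$1" -maxdepth 1 -type f -name '*.squashfs' | wc -l | tr -d ' '
}

# Return non-zero (with a message on stderr) if directory $1 holds
# more images than MAX_IMAGES
check_image_count() {
  n=$(count_images "$1")
  if [ "$n" -gt "$MAX_IMAGES" ]; then
    echo "too many images in $1: $n > $MAX_IMAGES" >&2
    return 1
  fi
}
```

If the check fails, a coarser granularity (fewer, larger images) is the usual way out.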
 
 
If you have, for example, a really big directory /data/.../DDD/DD/ containing hundreds of sub-directories D1 D2 ... DN, you may prefer to make one image per sub-directory.
 
 
Example (in bash):
 
 
  cd /data/.../DDD
  # Build separate directories for the images and the mountpoints
  mkdir DD-img DD-mnt
  cd DD
  for i in D*; do
    # Create the image
    mksquashfs-no-compression "$i" "../DD-img/$i.squashfs"
    # Create the mountpoint for your jobs
    mkdir "../DD-mnt/$i"
  done
 
 
Then, in your jobs, if you need to mount all those images:
 
 
  cd /data/.../DDD/DD-mnt || exit
  for i in *; do
    sudo mountimg "../DD-img/$i.squashfs" "$i" || exit
  done
 
 
Some mksquashfs hints:
 
* If the destination image exists, the source files/directories are appended to it.
** Use the '''-noappend''' option to re-create the image completely, or remove the image first.
* If a single directory is specified (i.e. mksquashfs source output_fs), the squashfs filesystem will consist of that directory, with the top-level root directory corresponding to the source directory.
** Use the '''-keep-as-directory''' option to circumvent that.
* If multiple source directories or files are specified, mksquashfs will merge the specified sources into a single filesystem, with the root directory containing each of the source files/directories. The name of each directory entry will be the basename of the source path. If more than one source entry maps to the same name, the conflicts are named xxx_1, xxx_2, etc. where xxx is the original name.
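
The first two hints can be illustrated with two small bash helpers. This is a sketch only: rebuild_image, image_with_top_dir and the MKSQUASHFS override variable are hypothetical names, while '''-noappend''' and '''-keep-as-directory''' are the mksquashfs options described above.

```shell
# Command to run; override it (e.g. with "echo") for a dry run.
MKSQUASHFS=${MKSQUASHFS:-mksquashfs}

# Usage: rebuild_image SOURCE_DIR IMAGE
# Re-create IMAGE from scratch instead of appending to an existing one.
rebuild_image() {
  $MKSQUASHFS "$1" "$2" -noappend
}

# Usage: image_with_top_dir SOURCE_DIR IMAGE
# Same, but the image root then contains SOURCE_DIR itself (e.g. D1/)
# rather than the contents of SOURCE_DIR.
image_with_top_dir() {
  $MKSQUASHFS "$1" "$2" -noappend -keep-as-directory
}
```

For example, rebuild_image D1 ../DD-img/D1.squashfs replaces an outdated image, where the plain command would have appended D1 to it again.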
 
 
'''mksquashfs-no-compression''' is a simple wrapper around mksquashfs that disables all compression to focus on speed. Feel free to try '''mksquashfs''' directly with other options such as '''-comp lzo''' to save disk space.
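
To judge whether compression is worth it for your data, you can build a test image with a given compressor and look at its size. A minimal sketch, assuming the installed mksquashfs was built with support for the compressor you request; build_compressed and the MKSQUASHFS override variable are hypothetical names.

```shell
# Command to run; can be overridden for testing.
MKSQUASHFS=${MKSQUASHFS:-mksquashfs}

# Usage: build_compressed SOURCE_DIR IMAGE COMPRESSOR
# Builds IMAGE from SOURCE_DIR with the given compressor (e.g. lzo)
# and prints the resulting image size in KiB.
build_compressed() {
  src=$1; out=$2; comp=$3
  # -noappend: overwrite IMAGE so that repeated trials do not accumulate
  $MKSQUASHFS "$src" "$out" -comp "$comp" -noappend >/dev/null || return 1
  du -k "$out" | cut -f1
}
```

Comparing this size with that of the image produced by '''mksquashfs-no-compression''' for the same directory shows how much disk space the compressor saves on your files.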
 
 
=== Using sudo mountimg ===
 

Current revision as of 13 February 2019, 08:42