Monday, May 21, 2018

What could the HDFGroup do to improve HDF5 (and HDF4)

Back in 2014, I tried to do a bit of cleanup on HDF4 & HDF5.  I'm sure my comments are long out of date...


A few things would accelerate contributions from the community:

  • Mirror the git repos to github and set them up to automatically get updated
  • Set up continuous integration (CI) with travis-ci, appveyor, coveralls, and/or other providers
  • Allow bugs and pull requests on github.  Then people can propose patches and make sure they don't break the builds
  • Make sure primary development is going towards the "master" branch.  That's where people expect to look for the latest changes with git-based code.  That's what people expect to send you patches against.

Even and I wrote down some of what we do for GDAL, but it's getting out of date after a few years.  Use these and any other tool you can (e.g. clang-tidy modernize).

http://erouault.blogspot.com/2016/01/software-quality-improvements-in-gdal.html

Other things that I personally think would be great for the HDF team to do:

  • Sign up for OSS Fuzz.  Use Google resources to try to crash the code and send you bug reports
  • Change the tests to only write into a user-controllable temp location.  I run tests from a CAS filesystem that is read-only, and the runner gets passed the location where it is allowed to write.
  • Define what the acronyms are in files.  I don't remember what the "AC" means even when I'm in H5AC.c.
  • Drop things like who created each file and when.  Those things are in the version control history and are noise when debugging
  • Have more tests that try to do things in isolation - true unit tests, e.g. tests that exercise just a single function.  I usually expect the testing code to be as long as or longer than the file it tests, with each test case testing only one thing
  • Make sure that the code becomes and stays whitespace clean.  Best is to use clang-format, but things like perl -pi -e 's/\s+\n/\n/g' work quite well
  • Don't use line feed characters in the source.  That makes viewing on large screens harder with some tools
  • Consider using a test framework like googletest/gunit.  Many people are experienced with these frameworks and can better follow your tests and contribute back
  • Switch to C++11 as the minimum version of C++.  So many things get easier/better e.g. https://trac.osgeo.org/gdal/wiki/rfc68_cplusplus11
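
To make the temp location point above concrete, here is a minimal sketch in C of how a test helper might pick its output directory.  The function names and the TEST_TMPDIR environment variable are hypothetical, made up for this illustration; they are not part of HDF5 or of any particular test runner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Return the directory that tests are allowed to write to.
 * TEST_TMPDIR is a hypothetical variable name chosen for this sketch;
 * TMPDIR and "/tmp" are used as fallbacks. */
static const char *test_tmp_dir(void)
{
    const char *dir = getenv("TEST_TMPDIR");
    if (dir == NULL || dir[0] == '\0')
        dir = getenv("TMPDIR");
    if (dir == NULL || dir[0] == '\0')
        dir = "/tmp";
    return dir;
}

/* Write "<dir>/<filename>" into out.
 * Returns 0 on success, or -1 if the result does not fit in out_size bytes. */
static int test_tmp_path(char *out, size_t out_size, const char *dir,
                         const char *filename)
{
    assert(out != NULL && dir != NULL && filename != NULL);
    int n = snprintf(out, out_size, "%s/%s", dir, filename);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

A test would then call something like test_tmp_path(buf, sizeof buf, test_tmp_dir(), "out.h5") and create its files only under that path, so the source tree itself can stay read-only.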

2 comments:

  1. This comment has been removed by the author.

  2. There are lots more things that could be done, and I have talked to the HDFGroup about this list. Another easy fix:

    codespell | grep == | wc -l
    564

    Some examples:

    ./release_docs/INSTALL_Cygwin.txt:5: libary ==> library
    ./fortran/examples/compound.f90:116: memeber ==> member
    ./fortran/examples/refobjexample.f90:29: wtih ==> with
    ./fortran/examples/testh5fc.sh.in:64: otehr ==> other
    ./fortran/test/tH5T.F90:101: Lenght ==> Length
    ./fortran/test/H5_test_buildiface.F90:17: availablity ==> availability
    ./fortran/test/tH5S.F90:238: occured ==> occurred
    ./fortran/test/tH5S.F90:260: writen ==> written
    ./fortran/test/tH5A_1_8.F90:67: equivelent ==> equivalent
    ./fortran/test/tH5Z.F90:375: occured ==> occurred
    ./fortran/src/H5Pff.F90:5329: arguement ==> argument
    ./fortran/src/H5Df.c:148: accomodate ==> accommodate
    ./fortran/src/H5Pf.c:921: whther ==> whether
    ./fortran/src/H5Dff.F90:470: compatability ==> compatibility
    ./fortran/src/H5f90proto.h:487: frome ==> from
    ./fortran/src/H5_buildiface.F90:334: consistant ==> consistent
