Changes

Parker Addison · 3a0c7463
--- a/Notes.md
+++ b/Notes.md
 # Filesystem Benchmarks

+- [IOR](#ior)
+  - [Setup](#setup)
+  - [Running](#running)
+- [IO-500](#io-500)
+  - [Setup](#setup-1)
+  - [Running](#running-1)
+- [Custom image for IO500 dependencies](#custom-image-for-io500-dependencies)
+- [FIO benchmark](#fio-benchmark)
+  - [References, graphs, and job files](#references-graphs-and-job-files)
+- [Understanding IOR output](#understanding-ior-output)
+
 ###### 03/09-03/13 2021

 ## IOR
@@ -297,3 +308,41 @@ This is probably a good search: https://github.com/search?q=fio+benchmark&type=R

 > ###### [957b70c](ssh///https://gitlab-ssh.nautilus.optiputer.net/30622/parkeraddison/filesystem-benchmarks/commit/957b70caaca3f180ca323dbf1045965823f349df)
 > "Exploration of IOR and FIO benchmarks; Noteful wiki" | HEAD -> main | 2021-03-14
+
+###### 03/15
+
+Time to figure out how to start making sense of and plotting the outputs. That way I can make sure that IO500 and/or FIO are good choices to pursue.
+
+Once that's done, we can start to figure out how to run this on Pleiades. Henry mentioned that a Python virtualenv would be one way to get specific software (I think one of the repos above is a Python wrapper...). Some packages should [already be available](https://www.nas.nasa.gov/hecc/support/kb/using-software-packages-in-pkgsrc_493.html). Also, I'd expect that *as an HPC environment* lots of the software needed for these HPC filesystem benchmarks should already be present!
+
+## Understanding IOR output
+
+Some description of IOR output: https://gitlab.msu.edu/reyno392/good-practices-in-IO/blob/dfcff70e9b9e39f1199f918d1a4000f44bc1b384/benchmark/IOR/USER_GUIDE#L686
+
+Looks like the charts seen in some of the papers I came across earlier (e.g. [this one](https://cug.org/proceedings/cug2014_proceedings/includes/files/pap162.pdf)) were made using an I/O profiler "[Darshan](https://www.mcs.anl.gov/research/projects/darshan/)". I'm sure there must be a profiler used at NAS.
+
+Seems like a promising reference: https://cug.org/5-publications/proceedings_attendee_lists/2007CD/S07_Proceedings/pages/Authors/Shan/Shan_slides.pdf
+
+**The useful outputs** of IOR are simply *read* and *write* **bandwidth**, reported in MiB/s (mebibytes per second), and **operations per second**.
+
+**The charts** seen in papers and presentations, such as [here](https://cug.org/5-publications/proceedings_attendee_lists/2007CD/S07_Proceedings/pages/Authors/Shan/Shan_slides.pdf#page=14), are the result of *multiple runs of IOR with different parameters.*
+
+For example, useful charts may demonstrate how bandwidth changes as transfer size, effective file size per processor, or number of processors increases.
+
+This is something I could (hopefully easily) whip up and have it be useful -- run a bunch of IOR tests on a parameter grid. The Lustre docs do [this exact thing](https://wiki.lustre.org/IOR#Example:_IOR_Read_.2F_Write_Test.2C_Single_File.2C_Multiple_Clients) in their example, stepping through 1, 2, 4, and 8 processors.
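
A minimal sketch of the parameter-grid idea above, not the script from this commit. It only builds the `mpirun`/`ior` command lines for each combination of process count, transfer size, and block size. The function name `ior_commands`, the test file path `/data/ior_testfile`, and the sizes are assumptions for illustration. The flags used (`-w`, `-r`, `-t`, `-b`, `-o`, `-F`) are standard IOR options, and IOR expects the block size to be a multiple of the transfer size.

```python
import itertools
import shlex


def ior_commands(num_procs, transfer_sizes, block_sizes,
                 test_file="/data/ior_testfile", file_per_proc=False):
    """Yield one mpirun/ior argument list per point on the parameter grid.

    test_file is a hypothetical path; point it at the volume under test.
    """
    for n, t, b in itertools.product(num_procs, transfer_sizes, block_sizes):
        cmd = [
            "mpirun", "-n", str(n),  # number of MPI processes
            "ior",
            "-w", "-r",              # run a write phase, then a read phase
            "-t", t,                 # transfer size per I/O call
            "-b", b,                 # block size written by each process
            "-o", test_file,         # file to write to and read from
        ]
        if file_per_proc:
            cmd.append("-F")         # one file per process instead of a shared file
        yield cmd


if __name__ == "__main__":
    # Print the commands rather than running them; each list could instead
    # be passed to subprocess.run(cmd, check=True) on a node with MPI and IOR.
    for cmd in ior_commands([1, 2, 4, 8], ["64k", "1m"], ["16m"]):
        print(shlex.join(cmd))
```

Keeping command construction separate from execution makes it easy to check the grid before spending time on the cluster, and the same lists can later be paired with the bandwidth each run reports for plotting.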
+
+I might try this out right now on Nautilus... let me go ahead and set up a slightly larger volume.
+
+- [ ] Talk with John/Dima about how large of a volume and how many pods I can set up for future benchmarking on Nautilus
+
+---
+
+###### 03/16
+
+Things are starting to make more sense and work more consistently with IOR and FIO runs in Nautilus.
+
+One thing I'm not fully sure about is how important it is to specify a file in IOR, or how to use that option... For instance, if I create a file of random bytes as seen [here](https://gitlab.msu.edu/reyno392/good-practices-in-IO/-/blob/master/generateinput.sh), is there any point in using that as an existing file to read from? Ah... perhaps there is a point. I could create multiple small files or one very large file... this coupled with filePerProc... maybe that's the point.
+
+
+> ###### [c57f6dd](ssh///https://gitlab-ssh.nautilus.optiputer.net/30622/parkeraddison/filesystem-benchmarks/commit/c57f6dd8feb60eedf5a0c0ae809f30e3ef91f111)
+> "Minimal IOR test script; Repo organization" | HEAD -> main | 2021-03-16