========================== SUCCESS ===============================================================

Job generation performance: (minimise stat calls) ECFLOW-1244
   We now cache the file stat calls in ECfFile (this is for include files only).
   Saving on stat calls gave a small but noticeable improvement.
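
A minimal sketch of the idea, assuming a POSIX system. The class name FileExistsCache and its members are hypothetical and are not taken from the ecFlow sources; the sketch only shows how remembering the result of a stat call avoids repeating it for the same include file.

```cpp
#include <sys/stat.h>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper (not the ecFlow implementation): remembers whether a
// path exists, so repeated look-ups of the same include file cost one stat.
class FileExistsCache {
public:
    bool exists(const std::string& path) {
        auto it = cache_.find(path);
        if (it != cache_.end()) return it->second;   // cache hit: no system call

        struct stat st;
        ++stat_calls_;                                // cache miss: exactly one stat
        const bool found = (::stat(path.c_str(), &st) == 0);
        cache_.emplace(path, found);
        return found;
    }

    // Number of real stat calls made so far.
    std::size_t stat_calls() const { return stat_calls_; }

private:
    std::unordered_map<std::string, bool> cache_;
    std::size_t stat_calls_ = 0;
};
```

A cache like this can go stale if files appear or disappear while it is alive, so it would need to be scoped to a single job generation pass or cleared explicitly.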

Server<DONE> ECFLOW-846
   During job processing, the same head/tail or any other include file can be opened/closed many times.
   We now cache the file, so that we only open/close it once.
   This was highlighted during job processing, where hundreds of tasks were submitted
   at the same time.
   See test: Base/bin/gcc-5.3.0/release/perf_job_gen ./metabuilder.def
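
The include file cache described above can be sketched as follows. IncludeFileCache and its methods are illustrative names only, assumed for this example; the actual ecFlow code may be organised differently.

```cpp
#include <cstddef>
#include <fstream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cache (names are illustrative, not ecFlow's): each include
// file is opened and read once, then served from memory.
class IncludeFileCache {
public:
    // Returns the lines of `path`, reading the file only on the first request.
    // Returns nullptr if the file cannot be opened.
    const std::vector<std::string>* lines(const std::string& path) {
        auto it = cache_.find(path);
        if (it != cache_.end()) return &it->second;   // already loaded

        std::ifstream in(path.c_str());
        if (!in) return nullptr;
        ++opens_;
        std::vector<std::string> content;
        for (std::string line; std::getline(in, line);) content.push_back(line);
        return &cache_.emplace(path, std::move(content)).first->second;
    }

    // Number of files successfully opened so far.
    std::size_t opens() const { return opens_; }

private:
    std::map<std::string, std::vector<std::string>> cache_;
    std::size_t opens_ = 0;
};
```

With hundreds of tasks sharing the same head/tail files, each file is then opened once per cache lifetime rather than once per task.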
   
Checkpt<DONE>
      Compare performance of writing checkpt to a string, with
      writing to file directly. (It is far more efficient to write 
      a few large writes than many smaller writes.)
      Hence this should be more efficient even *WITHOUT* memory mapped files.
      (DONE: saving of about 0.5-1sec on very large files)
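
As a rough illustration of "few large writes", the sketch below builds the whole checkpoint in memory and hands it to the file in one call. The function write_checkpoint and the record format are assumptions made for this example, not the ecFlow checkpoint code.

```cpp
#include <fstream>
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// Illustrative only: serialise every record into one in-memory string first,
// then pass it to the file in a single write call.
inline bool write_checkpoint(const std::string& path,
                             const std::vector<std::string>& records) {
    std::ostringstream buffer;
    for (const std::string& r : records) {
        buffer << r << '\n';                 // many small appends, all in memory
    }

    const std::string data = buffer.str();
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(data.data(), static_cast<std::streamsize>(data.size()));  // one large write
    return out.good();
}
```

The stream library does its own buffering, so the gain depends on the platform and on how the serialisation layer writes; the figures above are the measured result for very large files.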
      
Client<DONE>
   When using the class ClientInvoker, most of the user commands end
   up using Program Options. This was done for a consistent interface,
   i.e. the ClientInvoker api converts each api call into argc/argv
   arguments, which are then parsed using Program Options.

   Unfortunately Program Options is very slow compared to other functionality.
   Hence for speed-critical requests we should bypass it,
   i.e. we call the client to server command directly.

o Incremental sync: <DONE>
  Make sure we only update the change number when a data model change is made.
  The flags were being set/cleared without making any changes (i.e. same value),
  but the change number was still incremented, causing unnecessary mementos.
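
A minimal sketch of the guard, using a hypothetical Flags class and a caller-supplied change counter (neither is the ecFlow type): the change number only advances when the stored value really differs.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical stand-in for the server's flag and change-number handling.
class Flags {
public:
    // Sets or clears bit `pos`. The change number advances only when the
    // stored value really differs, so no-op updates create no mementos.
    // Returns true if the flag changed.
    bool set(std::size_t pos, bool value, unsigned int& global_change_no) {
        if (bits_.test(pos) == value) return false;   // same value: nothing to sync
        bits_.set(pos, value);
        state_change_no_ = ++global_change_no;
        return true;
    }

    unsigned int state_change_no() const { return state_change_no_; }

private:
    std::bitset<16> bits_;
    unsigned int state_change_no_ = 0;
};
```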
 
o Minimise system level calls made by the server
  To record all system level calls we call:
  strace -p <server pid> -o output_file.txt
  
  - This shows that during job generation, when creating the output directory,
    we were making too many calls to stat.
    Checking first whether the directory exists saved thousands of stat calls.
    <DONE>
    
  - Signal handling;
    We control when we handle child process termination signals.
    We were making unnecessary calls to block SIGCHLD.
    <DONE>
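
For the output directory item above, the fast path can be sketched like this, assuming a POSIX system. The function names are hypothetical; the point is that the common case (directory already exists) costs a single stat, and the component-by-component creation only runs when it is really needed.

```cpp
#include <sys/stat.h>
#include <sys/types.h>
#include <string>

// One stat call: true if `dir` exists and is a directory.
inline bool is_directory(const std::string& dir) {
    struct stat st;
    return ::stat(dir.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
}

// Slow path: create each missing component of `dir` in turn (like `mkdir -p`).
// This is where the many stat calls come from.
inline bool create_all(const std::string& dir) {
    for (std::string::size_type pos = 1; pos <= dir.size(); ++pos) {
        if (pos == dir.size() || dir[pos] == '/') {
            const std::string part = dir.substr(0, pos);
            if (!is_directory(part) && ::mkdir(part.c_str(), 0755) != 0) return false;
        }
    }
    return true;
}

// Fast path first: during job generation the output directory usually exists
// already, so a single stat is enough.
inline bool ensure_directory(const std::string& dir) {
    if (dir.empty()) return false;
    if (is_directory(dir)) return true;
    return create_all(dir);
}
```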

o <DONE>Move trigger and complete AST into the expression class,
  <DONE> move repeat generated variable into Repeat base class. Saves on in-memory size.
        Hence reducing size of node, when no trigger expressions.
        + save on initialisation cost
        Big saving in memory size, i.e 8 bytes times 300000 ~2MB in memory

           Release mode:          TEXT      PORTABLE BINARY    BINARY
           load into server       ~18s          
           download from server   ~12.7s           
          
           Load time improved by ~2 seconds
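
The memory saving from moving the AST into the expression class can be illustrated with a simplified layout. The structs below are stand-ins invented for this sketch, not the ecFlow classes; they assume a typical build where a std::unique_ptr with the default deleter is the size of a raw pointer.

```cpp
#include <cstddef>
#include <memory>
#include <string>

struct Ast { /* parsed expression tree (details omitted) */ };

// Before: the node carries a separate AST pointer even when it has no trigger.
struct NodeBefore {
    std::unique_ptr<std::string> trigger_text;
    std::unique_ptr<Ast>         trigger_ast;
};

// After: the AST lives inside the expression, so a node without a trigger
// pays for one null pointer instead of two.
struct Expression {
    std::string          text;
    std::unique_ptr<Ast> ast;   // created only when the expression is parsed
};

struct NodeAfter {
    std::unique_ptr<Expression> trigger;
};

// Bytes saved per node for each pointer that moves into the expression.
constexpr std::size_t saved_per_node = sizeof(NodeBefore) - sizeof(NodeAfter);
```

On a 64-bit build saved_per_node is 8, and 8 bytes times 300000 nodes is 2.4MB, which matches the rough ~2MB quoted above.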

o <DONE> AutoCancel: Very rarely used. But uses a shared ptr serialisation
         This was *because* of python api.
         Investigate use of raw ptr & effect on python api. OK
         There is a performance penalty in using shared ptr in serialisation
         even if it is empty.
           
           Release mode:          TEXT      PORTABLE BINARY    BINARY
           load into server       ~18s          
           download from server   ~12s           
           
           improved download by 0.7 seconds
        
o <DONE> Late, Remove shared ptr and replace with raw ptr

   AParser/perf_aparser_timer  $SCRATCH/3199.def::
      This mimics the download without the network (but ignore compare).
      Times for Checkpt/restore/compare, 
         Before Changes: 12.42, 13.63, 12.38, 12.54, 14.27: Avg: 13.08
         After Changes:  12.26, 12.17, 13.32, 12.63, 12.79: Avg: 12.6   (saved ~0.4s)

   Client/test/TestPerfForLargeDefs.cpp (saved ~ 0.2 seconds)
      ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:18 Download: 11, 12, 13, 12, 13 Avg: 12.2
      ${ECF_TEST_DEFS_DIR}/mega.def Delete:42 Load:2 Download: 3, 2, 1, 1, 1 Avg: 1.6


o <DONE>Variable state change no 
        Move to Node: will save (No of variables * 4) - (No of Nodes * 4) = ~2MB
        However a change to a single variable will require copying all node variables

   Client/test/TestPerfForLargeDefs.cpp:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:18 Download: 11, 12, 12, 12, 13 Avg: 12
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:41 Load:2 Download: 2, 2, 1, 1, 1 Avg: 1.4
   
   
o <DONE>Move time base attributes into a separate class
   improvement of about 1 seconds for large defs
   
   Client:: ...test_perf_for_large_defs:   port(3142)
   Client/bin/gcc-4.2.1/release/perf_test_large_defs
      ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:16 Download: 10, 11, 11, 11, 12 Avg: 11
      ${ECF_TEST_DEFS_DIR}/mega.def Delete:42 Load:2 Download: 2, 1, 1, 1, 1 Avg: 1.2
      
      
o <DONE>: Fix issue of delete taking too long

   Client:: ...test_perf_for_large_defs:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:16 Download: 10, 11, 11, 11, 12 Avg: 11
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Download: 2, 1, 1, 1, 1 Avg: 1.2


o <DONE> Take edit history out of node and place into Defs.

   Save of ~ 0.6 seconds
   
   Client:: ...test_perf_for_large_defs:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:15 Download: 10, 10, 10, 11, 11 Avg: 10.4
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Download: 2, 1, 1, 1, 1 Avg: 1.2

o <DONE> After begin, download times are much slower, investigate
         Times are the same

o <DONE>Distinguish between checkpoint and client comms
        i.e edit history should be check pointed but does *NOT*
            need serialising when requesting full defs.
           + add a custom command to get the edit history, given a path
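
One way to picture the split between checkpointing and client comms is a serialisation routine that takes its target as a parameter. The enum ArchiveTarget, the Defs struct and the text format below are all assumptions made for this sketch and do not reflect the real ecFlow serialisation.

```cpp
#include <sstream>
#include <string>
#include <vector>

enum class ArchiveTarget { Checkpoint, ClientComms };  // hypothetical names

// Simplified stand-in for the definition held by the server.
struct Defs {
    std::vector<std::string> nodes;
    std::vector<std::string> edit_history;
};

// Writes the definition as text. Edit history is persisted in the check point
// but left out when a client asks for the full definition.
inline std::string serialise(const Defs& defs, ArchiveTarget target) {
    std::ostringstream out;
    for (const std::string& n : defs.nodes) out << "node " << n << '\n';
    if (target == ArchiveTarget::Checkpoint) {
        for (const std::string& h : defs.edit_history) out << "history " << h << '\n';
    }
    return out.str();
}
```

A client that needs the history would then ask for it explicitly, per path, using the custom command mentioned above.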
 
o <DONE>Events, meters, labels: move into child attributes

        Client:: ...test_perf_for_large_defs:   port(3142)
         ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:14 Begin:3 Download: 9, 10, 10, 10, 10 Avg: 9.8
         ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Begin:0 Download: 2, 1, 1, 1, 1 Avg: 1.2

        oetzi{/tmp/ma0/workspace/ecflow/AParser}:483 --> bin/gcc-4.2.1/release/perf_aparser_timer ${ECF_TEST_DEFS_DIR}/3199.def
         Parsing Node tree and AST creation time = 5.13 parse(1)
         Checkpt(TEXT_ARCHIVE) and reload , time taken   = 9.63 file_size(58317543)  result(1) msg()
         Total elapsed time = 14 seconds
   
o <DONE>Node suspended, state is being persisted ???

   oetzi{/tmp/ma0/workspace/ecflow}:520 --> AParser/bin/gcc-4.2.1/release/perf_aparser_timer ${ECF_TEST_DEFS_DIR}/3199.def
   Parsing Node tree and AST creation time = 5.18 parse(1)
   Checkpt(TEXT_ARCHIVE) and reload , time taken   = 9.4 file_size(57675793)  result(1) msg()
   Total elapsed time = 14 seconds

o <DONE> Move zombie & verify attr to a separate class , rarely used

      ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:15 Begin:4 Download: 8, 9, 9, 9, 10 Avg: 9
      ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Begin:0 Download: 2, 1, 1, 1, 1 Avg: 1.2

        oetzi{/tmp/ma0/workspace/ecflow/AParser}:538 --> bin/gcc-4.2.1/release/perf_aparser_timer ${ECF_TEST_DEFS_DIR}/3199.def
         Parsing Node tree and AST creation time = 5.34 parse(1)
         Checkpt(TEXT_ARCHIVE) and reload , time taken   = 9.09 file_size(56071732)  result(1) msg()
         Total elapsed time = 14 seconds

o <DONE> Removed shared ptrs from trigger and complete expressions

   oetzi{/tmp/ma0/workspace/ecflow}:646 --> Client/bin/gcc-4.2.1/release/perf_test_large_defs 
   Running 1 test case...
   Starting server
   Client:: ...test_perf_for_large_defs:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:15 Begin:3 Download: 8, 9, 9, 9, 9 Avg: 8.8
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Begin:0 Download: 2, 1, 1, 1, 1 Avg: 1.2
   8.8->9.8

   oetzi{/tmp/ma0/workspace/ecflow/AParser}:11 --> bin/gcc-4.2.1/release/perf_aparser_timer ${ECF_TEST_DEFS_DIR}/3199.def
   Parsing Node tree and AST creation time = 4.44 parse(1)
   Checkpt(TEXT_ARCHIVE) and reload , time taken   = 8.7 file_size(56071728)  result(1) msg()
   Total elapsed time = 13 seconds
   8.7->10.72

o <DONE> Move: Generate suite generate variables, do not persist them.
 
  oetzi{/tmp/ma0/workspace/ecflow/AParser}:782 --> bin/gcc-4.2.1/release/perf_aparser_timer ${ECF_TEST_DEFS_DIR}/3199.def
  Parsing Node tree and AST creation time = 4.4 parse(1)
  Checkpt(TEXT_ARCHIVE) and reload , time taken   = 8.66 file_size(56069424)  result(1) msg()
  Total elapsed time = 13 seconds
   8.66->8.78
   
   oetzi{/tmp/ma0/workspace/ecflow}:778 --> Client/bin/gcc-4.2.1/release/perf_test_large_defs
   Running 1 test case...
   Starting server
   Client:: ...test_perf_for_large_defs:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:14 Begin:3 Download: 8, 9, 9, 9, 9 Avg: 8.8
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Begin:0 Download: 2, 1, 1, 1, 1 Avg: 1.2

o <DONE>Calendar: Do not persist cache data

   oetzi{/tmp/ma0/workspace/ecflow}:831 --> AParser/bin/gcc-4.2.1/release/perf_aparser_timer ${ECF_TEST_DEFS_DIR}/3199.def
   Parsing Node tree and AST creation time = 4.38 parse(1)
   Checkpt(TEXT_ARCHIVE) and reload , time taken   = 8.65 file_size(56069114)  result(1) msg()
   Total elapsed time = 13 seconds


o <DONE> Track_never objects which are never serialised through pointers

   oetzi{/tmp/ma0/workspace/ecflow}:1060 --> Client/bin/gcc-4.2.1/release/perf_test_large_defs 
   Running 1 test case...
   Starting server
   Client:: ...test_perf_for_large_defs:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:14 Begin:3 Download: 7, 8, 8, 8, 8 Avg: 7.8
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Begin:0 Download: 2, 1, 1, 1, 1 Avg: 1.2

   oetzi{/tmp/ma0/workspace/ecflow/AParser}:1043 --> bin/gcc-4.2.1/release/perf_aparser_timer ${ECF_TEST_DEFS_DIR}/3199.def
   Parsing Node tree and AST creation time = 4.59 parse(1)
   Checkpt(TEXT_ARCHIVE) and reload , time taken   = 7.89 file_size(55963684)  result(1) msg()
   Total elapsed time = 12 seconds

o <DONE> Checkpt ( migration), Need fallback where defs written with state

o <DONE> Cache the de-serialisation in the server when returning the FULL definition
  Currently, for each client request we have:
  
      client1:  --------------> get---------------> Server
                serialise---------<----de-serialize

      client2:  --------------> get---------------> Server
                serialise---------<----de-serialize

      client3:  --------------> get---------------> Server
                serialise---------<----de-serialize

   By caching the de-serialisation process, we can speed up
   the downloads. However whenever there is a state change
   we need to update the cache
 
      client1:  --------------> get---------------> Server
                serialise------<----de-serialisation

      client2:  --------------> get---------------> Server
                serialise---------<----return cache

      client3:  --------------> get---------------> Server
                serialise---------<----return cache
   
   
    Where do we store the cache? This is important since for a
    large definition it can be very large (100MB), hence we
    must avoid copies at all costs.
        
    Issues: The check pt should really update the cache, however
    the check pt includes the edit history ???
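
A minimal sketch of such a cache, assuming the cached artefact is the string produced for the FULL definition and that a state change number identifies when it goes stale. DefsCache and its interface are hypothetical. Handing out a shared pointer to an immutable buffer is one way to avoid copying a very large string per client.

```cpp
#include <functional>
#include <memory>
#include <string>

// Hypothetical cache: holds the string built for the full definition together
// with the state change number it was built from. Clients share one immutable
// buffer, so a very large string is never copied per request.
class DefsCache {
public:
    using Buffer = std::shared_ptr<const std::string>;

    // `build` is called only when the cached copy is missing or stale.
    Buffer get(unsigned int state_change_no,
               const std::function<std::string()>& build) {
        if (!cached_ || cached_change_no_ != state_change_no) {
            cached_ = std::make_shared<const std::string>(build());
            cached_change_no_ = state_change_no;
        }
        return cached_;
    }

private:
    Buffer       cached_;
    unsigned int cached_change_no_ = 0;
};
```

In this sketch the first request after a state change pays the full cost and later requests return the same buffer, which is the pattern visible in the diagrams above.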
   
   oetzi{/tmp/ma0/workspace/ecflow}:555 --> Client/bin/gcc-4.2.1/release/perf_test_large_defs 
   Running 1 test case...
   Client:: ...test_perf_for_large_defs:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:14 Begin:3  Download(Sync):9, 0, 0, 0, 0  
                                                            Download(Sync-FULL):5, 5, 4, 4, 4 Avg:4.4  
                                                            Download(FULL):8, 8, 9, 9, 9      Avg:8.6
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Begin:0  Download(Sync):2, 0, 0, 0, 0  
                                                           Download(Sync-FULL):0, 0, 0, 0, 0 Avg:0  
                                                           Download(FULL):1, 1, 1, 1, 1      Avg:1

o <DONE>Don't create generated variables in the default constructors, create on demand
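
Creating on demand can be sketched as lazy initialisation behind a pointer. The Task and GeneratedVariables types here are simplified stand-ins, and the two fields are only examples of what a generated variable might hold.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a node's generated variables.
struct GeneratedVariables {
    std::string task;
    std::string ecf_name;
};

class Task {
public:
    // The constructor does not build any generated variables.
    explicit Task(std::string name) : name_(std::move(name)) {}

    // Builds the generated variables the first time they are asked for.
    const GeneratedVariables& generated() {
        if (!gen_) {
            gen_.reset(new GeneratedVariables{name_, "/" + name_});
        }
        return *gen_;
    }

    bool has_generated() const { return gen_ != nullptr; }

private:
    std::string name_;
    std::unique_ptr<GeneratedVariables> gen_;
};
```

Nodes that are loaded but never asked for their generated variables then cost one null pointer each, which is consistent with the lower load time shown below.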

 
   Client:: ...test_perf_for_large_defs:   port(3142)
   ${ECF_TEST_DEFS_DIR}/3199.def Delete:0 Load:12 Begin:3  Download(Sync):7, 0, 0, 0, 0  
                                                            Download(Sync-FULL):4, 4, 4, 4, 4 Avg:4  
                                                            Download(FULL):8, 8, 8, 8, 8 Avg:8
   ${ECF_TEST_DEFS_DIR}/mega.def Delete:0 Load:2 Begin:0   Download(Sync):1, 0, 0, 0, 0  
                                                            Download(Sync-FULL):0, 0, 0, 0, 0 Avg:0  
                                                            Download(FULL):1, 1, 1, 1, 1 Avg:1


o <DONE> Defs::findAbsNode() new algorithm that avoids creating a vector of strings. New algo is 14% faster overall; hash: 0fc72c945b6
o <DONE> Avoid branching due to checking if Log::instance(). hash: 08103aa0e73
         Avoid std::stringstream in Command logging/Node log .   hash: 5f3840815f6
         Command interaction: approx 4.3% faster for the 3 changes above
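
For the findAbsNode() item above, the sketch below shows one way to walk an absolute path without first splitting it into a vector of strings. It is not the algorithm from the referenced commit; the Node struct and find_abs_node function are assumptions made for illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

// Simplified tree node; the root plays the role of the definition itself.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// Looks up an absolute path such as "/suite/family/task" under `root`.
// Each path component is compared in place with std::string::compare, so no
// temporary strings or token vectors are allocated.
inline const Node* find_abs_node(const Node& root, const std::string& path) {
    const Node* current = &root;
    std::string::size_type pos = 0;
    while (pos < path.size()) {
        if (path[pos] == '/') { ++pos; continue; }        // skip separators
        std::string::size_type end = path.find('/', pos);
        if (end == std::string::npos) end = path.size();

        const Node* next = nullptr;
        for (const auto& child : current->children) {
            if (child->name.size() == end - pos &&
                path.compare(pos, end - pos, child->name) == 0) {
                next = child.get();
                break;
            }
        }
        if (!next) return nullptr;                        // component not found
        current = next;
        pos = end;
    }
    return current == &root ? nullptr : current;          // "/" alone matches nothing
}
```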
                                                      