AXL_Config #38

Open
gonsie opened this issue May 22, 2019 · 19 comments

Comments

@gonsie
Member

gonsie commented May 22, 2019

We should have a config function to set some options, including the number of pthreads.

@tonyhutter
Collaborator

Agreed. An AXL_Config(id, key, value) function could write directly into a 'config' KVtree that hangs off our main kvtree.

@tonyhutter
Collaborator

So I'm currently writing some "cancel a transfer" test cases, and I need a way to set config values in AXL. Specifically, I need a 'file_delay' config value that means "in AXL_Dispatch, wait N milliseconds before copying the next file". The idea is that I can deterministically control how long my total transfer takes, so that I can call AXL_Cancel at the right time in the transfer. I'm running into problems where my transfer is finishing before I have time to cancel it.
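For illustration, here is a minimal sketch of the 'file_delay' idea: pause N milliseconds before each file after the first. The names delay_ms, copy_files_with_delay, and fake_copy are hypothetical helpers, not AXL functions, and the sketch assumes a POSIX system that provides nanosleep.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep on strict compilers */
#endif
#include <errno.h>
#include <time.h>

/* Sleep for `ms` milliseconds, resuming if a signal interrupts the sleep.
 * Returns 0 on success, -1 on error. Hypothetical helper, not AXL code. */
int delay_ms(long ms)
{
    if (ms <= 0) {
        return 0;
    }
    struct timespec req = { .tv_sec = ms / 1000,
                            .tv_nsec = (ms % 1000) * 1000000L };
    struct timespec rem;
    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR) {
            return -1;
        }
        req = rem;   /* sleep for whatever time is left */
    }
    return 0;
}

/* Stand-in for the real per-file copy; it only counts how often it ran. */
int fake_copies = 0;
int fake_copy(int index)
{
    (void)index;
    fake_copies++;
    return 0;
}

/* Copy `nfiles` files, waiting `file_delay_ms` before every file except
 * the first. Stops and returns -1 on the first failure. */
int copy_files_with_delay(int nfiles, long file_delay_ms,
                          int (*copy_one)(int index))
{
    for (int i = 0; i < nfiles; i++) {
        if (i > 0 && delay_ms(file_delay_ms) != 0) {
            return -1;
        }
        if (copy_one(i) != 0) {
            return -1;
        }
    }
    return 0;
}
```

With three files and a 500 ms delay, a loop like this takes at least one second, which gives a test a predictable window in which to call AXL_Cancel.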

Here's a strawman prototype for AXL_Config:

/*
 * Get/set an AXL configuration key
 *
 * This function allows you to get and set AXL configuration key values.
 *
 * key:         The configuration key you're looking up
 * set_value:   If you're setting a value, put your value here.  If you're
 *              getting a value, leave this NULL.
 *
 * On success, this will return the value you're getting or setting.  On
 * failure it will return NULL.  All values are strings.
 */
char *
AXL_Config (int id, char *key, char *set_value)

This will treat all configuration values as strings, which simplifies things a lot. Yes, you'll need to convert numerical values to strings first, but I don't think that will be a big deal, as the number may already be a string anyway (like if you pass in a 'file_delay' value in argv[] to axl_cp).
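As a rough sketch of what the all-strings convention asks of callers, the helpers below convert a number to a value string and parse it back with validation. The names config_value_from_long and config_value_to_long are made up for this example, and rejecting negative numbers is an assumption that happens to suit keys like 'file_delay' and 'num_threads'.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Format `n` as a decimal string in `buf`.
 * Returns 0 on success, -1 if the buffer is too small. */
int config_value_from_long(long n, char *buf, size_t buflen)
{
    int written = snprintf(buf, buflen, "%ld", n);
    return (written >= 0 && (size_t)written < buflen) ? 0 : -1;
}

/* Parse `value` as a non-negative decimal number.
 * Returns 0 and stores the result in *out, or -1 if the string is empty,
 * has trailing junk, is out of range, or is negative (an assumption here). */
int config_value_to_long(const char *value, long *out)
{
    if (value == NULL || *value == '\0') {
        return -1;
    }
    char *end = NULL;
    errno = 0;
    long n = strtol(value, &end, 10);
    if (errno == ERANGE || *end != '\0' || n < 0) {
        return -1;
    }
    *out = n;
    return 0;
}
```

A value that arrives in argv[] is already a string and can be passed through untouched; only the code that consumes the setting needs the parsing step.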

Some config keys we might want:

file_delay [milliseconds] - Delay N milliseconds between file transfers.
num_threads [number of threads] - Number of threads to use for a transfer (applicable to pthreads, and whatever other xfer types we make multithreaded)
compression [on|off] - Tar up all the files before transfer and decompress them at the destination. This could be useful for node to node transfers, since scp'ing individual files can be slow.

Thoughts?

@tonyhutter
Collaborator

Also, should the AXL_Config values be saved to the statefile?

@gonsie
Member Author

gonsie commented Jun 5, 2019

Yes, I would save these settings to the state file.

The AXL_Config prototype looks great to me. And I think that all strings are fine. I would make the return code AXL_SUCCESS or FAILURE, not null or the value (I think you mean this, just the comment wording isn't clear).

@tonyhutter
Collaborator

I'd prefer it to return NULL or the value. That way you can call the function directly, like:
printf("file_delay=%s", AXL_Config(id, "file_delay", NULL));

If you return AXL_SUCCESS/FAILURE, you'll need to provide an additional char** to store the key's value into, like:

int AXL_Config (int id, char *key, char *set_value, char **get_value)

It's just a little more awkward to use since you'll need an additional get_value variable to store the result.
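To make the trade-off concrete, here is a small sketch of both styles over a toy in-memory table. Everything in it (demo_config, demo_config_status, the DEMO_* constants, the fixed-size table) is hypothetical and stands in for whatever AXL would really use; the `id` argument is left out to keep it short.

```c
#include <string.h>

#define DEMO_MAX_KEYS 16
#define DEMO_MAX_LEN  64
#define DEMO_SUCCESS  0
#define DEMO_FAILURE  1

/* Toy storage: parallel arrays of keys and values. */
static char demo_keys[DEMO_MAX_KEYS][DEMO_MAX_LEN];
static char demo_vals[DEMO_MAX_KEYS][DEMO_MAX_LEN];
static int  demo_count = 0;

/* Pointer-returning style: get when set_value is NULL, otherwise set.
 * Returns the stored value, or NULL on failure or unknown key. */
const char *demo_config(const char *key, const char *set_value)
{
    if (key == NULL || strlen(key) >= DEMO_MAX_LEN) {
        return NULL;
    }
    int i;
    for (i = 0; i < demo_count; i++) {
        if (strcmp(demo_keys[i], key) == 0) {
            break;
        }
    }
    if (set_value == NULL) {
        return (i < demo_count) ? demo_vals[i] : NULL;   /* get */
    }
    if (strlen(set_value) >= DEMO_MAX_LEN) {
        return NULL;
    }
    if (i == demo_count) {                               /* new key */
        if (demo_count == DEMO_MAX_KEYS) {
            return NULL;
        }
        strcpy(demo_keys[i], key);
        demo_count++;
    }
    strcpy(demo_vals[i], set_value);
    return demo_vals[i];
}

/* Status-returning style: same behavior, but the value comes back
 * through the extra get_value out-parameter. */
int demo_config_status(const char *key, const char *set_value,
                       const char **get_value)
{
    const char *v = demo_config(key, set_value);
    if (get_value != NULL) {
        *get_value = v;
    }
    return (v != NULL) ? DEMO_SUCCESS : DEMO_FAILURE;
}
```

One caveat with the pointer-returning style: passing a NULL result straight to a printf "%s" conversion is undefined behavior in C, so a caller that prints the value directly still needs a NULL check for keys that may be unset.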

@gonsie
Member Author

gonsie commented Jun 6, 2019

Okay, that makes sense. So, to get the value, the set_value param is set to NULL? Would we ever want to actually set a variable to NULL?

@tonyhutter
Collaborator

That's a good point. We could say that setting it to "" clears it:

AXL_Config(id, "file_delay", "");

In that case set_value would be non-NULL, and set_value[0] would be '\0'. If you tried to get a value that had been set to "", it would be a special case, and return NULL. I'm open to other ideas though.
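A minimal sketch of that rule, reduced to a single value slot so the special case is easy to see. The struct config_slot type and the slot_config function are invented for this example only.

```c
#include <string.h>

#define SLOT_LEN 64

/* One configuration value; value[0] == '\0' means "not set". */
struct config_slot {
    char value[SLOT_LEN];
};

/* set_value == NULL : get the current value
 * set_value == ""   : clear the value
 * anything else     : store the new value
 * Returns the current value, or NULL if it is unset or was just cleared. */
const char *slot_config(struct config_slot *slot, const char *set_value)
{
    if (set_value != NULL) {
        if (strlen(set_value) >= SLOT_LEN) {
            return NULL;                 /* too long to store */
        }
        strcpy(slot->value, set_value);  /* copying "" clears the slot */
    }
    return (slot->value[0] != '\0') ? slot->value : NULL;
}
```

A side effect worth noting: clearing a key returns NULL, which under the strawman prototype is also the failure return, so a caller cannot tell a successful clear from an error by the return value alone.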

@gonsie
Member Author

gonsie commented Jun 6, 2019

As a thought exercise, I'd like to see if the SCR configs could be handled by a similar interface. There may be good reasons why SCR might need a different interface, but maybe not.

The SCR repo includes some examples of configurations in its system config and user config templates.

Could these be set via this interface? My initial thought is yes. Some configs come in groups (such as the settings of a cache directory), but that could be achieved using multiple calls to _Config.

@tonyhutter what other settings have we thought about making configurable? I know you've mentioned a few in other issues and PRs. I want to think through having those work with this interface as well.

@tonyhutter
Collaborator

You could have all the file attribute flags as configurables, like xattrs=on preserve_timestamps=on.

Most of the things we'd want to configure would be analogues of the cp options (http://man7.org/linux/man-pages/man1/cp.1.html). In fact, it would probably be a good idea to give the configs exactly the same names as the cp command-line options. That would give users a good idea of what to expect. It would also make test cases easy to write, since we could verify the AXL behaviour against the equivalent cp option.

@gonsie
Member Author

gonsie commented Jun 12, 2019

ping @adammoody

@gonsie
Member Author

gonsie commented Jun 12, 2019

Another thought I had: do we care about versioning? What if a user tries to set a configuration that doesn't exist in this version of the library? Maybe that only matters if we are reading a particular version of a config file, not when calling config at runtime.

@adammoody
Contributor

adammoody commented Jun 12, 2019

This all sounds good to me.

To answer @gonsie's question about how this relates to SCR configuration: as you mention, in SCR we do have cases of nested configurations, where a given config item has multiple child key/value pairs associated with it. In our SCR config files, we list this nesting on a single line, with the first key/value on the line serving as a parent to the remaining key/value entries, which are separated by spaces. We could get away with a single line since we only have two layers of nesting.

CKPT=0  SCHEME=XOR  STORE=/dev/shm  SIZE=8  INTERVAL=1
CKPT=1  SCHEME=PARTNER  STORE=/ssd   INTERVAL=10

A more traditional way might be to use indentation to indicate nesting, and if we did that in SCR, one might have something like the following to define multiple redundancy schemes for a single run:

CKPT=0
  SCHEME=XOR
  STORE=/dev/shm
  SIZE=8
  INTERVAL=1

CKPT=1
  SCHEME=PARTNER
  STORE=/ssd
  INTERVAL=10

I don't know if we have this nesting issue showing up in AXL yet, so it might be overkill to worry about it at this point. I also don't know of a good API to cleanly express the nesting -- we could let the user pass in a kvtree I suppose.

@tonyhutter
Collaborator

You could do it like this:

AXL_Config(int id, char *config_string)

Set:

AXL_Config(id, "num_threads=4")
AXL_Config(id, "CKPT=0 SCHEME=XOR")
AXL_Config(id, "CKPT=1 SCHEME=PARTNER STORE=/ssd INTERVAL=10")

Get:

AXL_Config(id, "num_threads")		// returns "4"
AXL_Config(id, "CKPT=1 SCHEME")		// returns "PARTNER"
AXL_Config(id, "CKPT=1 STORE")		// returns "/ssd"		
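One way a single string could carry both forms is sketched below: the string is split on whitespace, KEY=VALUE tokens become pairs, and a final token without '=' marks the call as a get. That "bare last token means get" rule, the fixed limits, and the names config_request and parse_config_string are all assumptions for illustration, not part of any proposal in this thread.

```c
#include <string.h>

#define MAX_PAIRS 8
#define MAX_TOKEN 64

struct config_pair {
    char key[MAX_TOKEN];
    char value[MAX_TOKEN];
};

struct config_request {
    int is_get;                          /* 1 if the last token had no '=' */
    int npairs;                          /* number of KEY=VALUE pairs seen */
    struct config_pair pairs[MAX_PAIRS];
    char query_key[MAX_TOKEN];           /* key to look up when is_get */
};

/* Split `s` into KEY=VALUE pairs plus an optional trailing bare key.
 * Returns 0 on success, -1 on empty or malformed input. */
int parse_config_string(const char *s, struct config_request *req)
{
    memset(req, 0, sizeof *req);         /* also NUL-terminates every field */
    if (s == NULL) {
        return -1;
    }
    while (*s != '\0') {
        s += strspn(s, " \t");           /* skip leading whitespace */
        size_t len = strcspn(s, " \t");  /* length of the next token */
        if (len == 0) {
            break;
        }
        if (req->is_get || len >= MAX_TOKEN) {
            return -1;                   /* bare key must be the last token */
        }
        const char *eq = memchr(s, '=', len);
        if (eq == NULL) {
            memcpy(req->query_key, s, len);
            req->is_get = 1;
        } else {
            size_t klen = (size_t)(eq - s);
            size_t vlen = len - klen - 1;
            if (klen == 0 || req->npairs == MAX_PAIRS) {
                return -1;
            }
            struct config_pair *p = &req->pairs[req->npairs++];
            memcpy(p->key, s, klen);
            memcpy(p->value, eq + 1, vlen);
        }
        s += len;
    }
    return (req->is_get || req->npairs > 0) ? 0 : -1;
}
```

Under these assumptions, "CKPT=1 SCHEME" parses to one pair (CKPT, 1) that selects the parent entry plus the query key SCHEME, while "CKPT=1 SCHEME=PARTNER" parses to two pairs and is treated as a set.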

@adammoody
Contributor

Ah, yes. Good idea. Works for me.

@tonyhutter
Collaborator

@adammoody how does SCR config work with respect to config options it doesn't recognize? Does it just ignore them? Store them in the KVTREE but do nothing? Error out?

@adammoody
Contributor

It should be storing them in the tree, but nothing processes them. We don't have any error checking in there right now looking for valid key names. Having said that, adding some checks would help users find/fix typos they may have made.
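A check like that could be as small as a lookup against a list of recognized names. The sketch below uses the three keys floated earlier in this thread as the list; the function name config_key_is_known and the idea of a static table are assumptions for illustration, not how SCR or AXL does it.

```c
#include <stddef.h>
#include <string.h>

/* Keys proposed earlier in this thread; a real list would live with the
 * code that consumes each setting. */
static const char *const known_keys[] = {
    "file_delay",
    "num_threads",
    "compression",
};

/* Returns 1 if `key` is a recognized configuration key, 0 otherwise. */
int config_key_is_known(const char *key)
{
    if (key == NULL) {
        return 0;
    }
    size_t n = sizeof known_keys / sizeof known_keys[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(known_keys[i], key) == 0) {
            return 1;
        }
    }
    return 0;
}
```

A setter could call this first and reject or warn about a typo such as "num_thread" instead of silently storing it.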

@rhaas80
Contributor

rhaas80 commented Jul 31, 2019

This may be a bit beyond the scope of the original question on how to implement an access API for the settings: is the format of the files and the tree already set in stone? Or would something like YAML be an option? Nesting would look something like this:

checkpoints:
  - CHKPT: 0
    SCHEME: XOR
  - CHKPT: 1
    SCHEME: PARTNER
    STORE: /ssd
    INTERVAL: 10

Parsing (and tree population) could be done via libyaml (https://pyyaml.org/wiki/LibYAML), but the access API (in C at least) ends up a bit more cumbersome, since one has to either return partial keys for users to search through manually or implement a query API à la "get me the 'checkpoints' sequence member where the 'CHKPT' scalar has value '0'", which is similar to the currently proposed API.

Even better would be something like libconfig (https://hyperrealm.github.io/libconfig/), which has a file syntax very similar to YAML but a somewhat higher-level interface.

@gonsie
Member Author

gonsie commented Jul 31, 2019

There is no file format now; all the settings are set explicitly (either at config time or by the caller requesting a transfer).

I think we'd have to go through the code to figure out what configuration options can be set. Similar to what we have in SCR, it would be useful to configure different STORE types that have certain properties. Maybe a config file would look like:

STORE_SOURCE=/ssd  STORE_DEST=/gpfs NATIVE=BB_API
NUM_THREADS=4

Keeping a line of text in the config file that has a group of information (similar to SCR) would be my vote. We'll have to figure out later what is valid for each line. 😐

@adammoody
Contributor

You could do it like this:

AXL_Config(int id, char *config_string)

Set:

AXL_Config(id, "num_threads=4")
AXL_Config(id, "CKPT=0 SCHEME=XOR")
AXL_Config(id, "CKPT=1 SCHEME=PARTNER STORE=/ssd INTERVAL=10")

Get:

AXL_Config(id, "num_threads")		// returns "4"
AXL_Config(id, "CKPT=1 SCHEME")		// returns "PARTNER"
AXL_Config(id, "CKPT=1 STORE")		// returns "/ssd"		

@rhaas80 implemented this type of set/query interface in SCR_Config. We can start from that code if we later decide we want to use strings to configure the components.


4 participants