Dreadnaut support #651

Open
wants to merge 66 commits into
base: main


@pramothragavan pramothragavan commented May 26, 2024

Broadly speaking, a dreadnaut file starts with "configuration" information about the graph, such as the number of vertices (denoted by 'n'), the start index for vertex numbering (denoted by '$') and whether or not a graph is a digraph (denoted by the presence of 'd'). The configuration section always ends with a 'g'. The rest of the file gives information concerning individual vertices in the form of adjacency lists. For example:

n=2
$=1
d
g
1: 1 2;
2: 2;

would represent a 1-indexed digraph with 2 vertices and directed edges (1,1), (1,2), (2,2).
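
To make the format concrete, here is a minimal Python sketch (not the GAP code in this PR) that reads only the restricted layout of the example above: one configuration item per line, a 'g', then one adjacency entry per line. The function name `read_simple_dreadnaut` is hypothetical, and real dreadnaut input is far more permissive than this.

```python
def read_simple_dreadnaut(text):
    """Parse the restricted example layout; returns (n, start, directed, edges)."""
    n, start, directed = None, 0, False  # '$' defaults to 0
    edges = []
    in_graph = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not in_graph:
            if line.startswith("n="):
                n = int(line[2:])
            elif line.startswith("$="):
                start = int(line[2:])
            elif line == "d":
                directed = True
            elif line == "g":
                in_graph = True  # configuration section ends here
            else:
                raise ValueError(f"unsupported configuration line: {line!r}")
            continue
        # Adjacency entry such as "1: 1 2;"
        head, _, tail = line.rstrip(";").partition(":")
        source = int(head)
        for token in tail.split():
            edges.append((source, int(token)))
    return n, start, directed, edges


example = "n=2\n$=1\nd\ng\n1: 1 2;\n2: 2;\n"
print(read_simple_dreadnaut(example))
```

The edges come back as ordered pairs in the file's own numbering; no reindexing is attempted in this sketch.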

General overview:

Decoder:

  • DIGRAPHS_ParseDreadnautConfig aims to get values for either '$' (which indicates the start index for vertex numbering) or 'n' (which indicates the number of vertices). Note that '$' defaults to 0 and that I chose to reindex all graphs so that vertex numbering starts at one (which I think is the convention for the Digraphs package?)
  • DIGRAPHS_LegalDreadnautEdge aims to filter out illegal edges and throws an error if an edge is illegal. An example of an illegal edge might be a loop for an undirected graph or an edge containing a vertex that is not allowed within the constraints of the values of '$' and 'n'. (In the case of illegal edges, nauty throws a warning message and then ignores the edge so I was trying to replicate this behaviour).
  • DIGRAPHS_SplitDreadnautLines effectively takes a line of dreadnaut (e.g. "1: 2 3 5; 4: 2 1 3; 2: 3;") and aims to split this into parts which are to be handled individually (in this case the parts would be ["1: 2 3 5;", "4: 2 1 3;", "2: 3;"]). The idea here is that although these parts would usually each be on their own line, it's technically fine for some or all of them to share a line (with or without a semicolon), so I thought it made more sense to condense everything onto one line and then split into parts. There are various auxiliary commands that can be used within the dreadnaut format alongside the definition of the graph (more info here), which I mostly chose to neglect, with the exception of 'f', which defines a partition of vertices. Note that '$$' at the end of a file means reindex the graph to start counting at 0 (which I ignored).
  • DIGRAPHS_ParseDreadnautGraph intends to parse the non-configuration part of the file, which has been split into parts after being fed through to DIGRAPHS_SplitDreadnautLines

These are all combined in ReadDreadnautGraph.
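
As a rough illustration of the splitting and edge-filtering steps described above, here is a Python sketch. The names `split_parts` and `is_legal_edge` are hypothetical, the regular-expression approach is an assumption rather than how the GAP helpers work, and the legality rules are only the two examples named in the description (out-of-range vertices, and loops in undirected graphs).

```python
import re


def split_parts(line):
    """Split 'v: a b c' entries sharing one line, with or without ';' separators."""
    # Each part starts at an integer followed by ':' and runs to the next such start.
    starts = [m.start() for m in re.finditer(r"\d+\s*:", line)]
    parts = []
    for i, begin in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(line)
        parts.append(line[begin:end].strip())
    return parts


def is_legal_edge(u, v, n, start, directed):
    """Both ends must lie in [start, start + n - 1]; loops need a directed graph."""
    in_range = all(start <= x <= start + n - 1 for x in (u, v))
    return in_range and (directed or u != v)


print(split_parts("1: 2 3 5; 4: 2 1 3; 2: 3;"))
```

Splitting on "integer followed by a colon" rather than on semicolons is what lets entries without a trailing semicolon still be separated.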

Encoder:
WriteDreadnautGraph takes a digraph and encodes into dreadnaut format.
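
For orientation, a minimal Python sketch of what such an encoder might produce, mirroring the layout of the example at the top of this description. This is not the PR's WriteDreadnautGraph; the function name `write_dreadnaut`, the choice to always emit `$=1`, and the `;` after every entry are assumptions made for illustration.

```python
def write_dreadnaut(out_neighbours, directed=True):
    """Encode adjacency lists (vertices numbered from 1) as dreadnaut-style text."""
    n = len(out_neighbours)
    lines = [f"n={n}", "$=1"]
    if directed:
        lines.append("d")
    lines.append("g")  # end of the configuration section
    for v, nbrs in enumerate(out_neighbours, start=1):
        lines.append(f"{v}:" + "".join(f" {w}" for w in nbrs) + ";")
    return "\n".join(lines) + "\n"


print(write_dreadnaut([[1, 2], [2]]), end="")
```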

I'm in the process of writing documentation!

Member

@james-d-mitchell james-d-mitchell left a comment

Generally speaking this looks really good! I've added a few comments, mostly about adding some details to the error messages if possible. Some general comments:

  • it'd be great if the error messages could report the line of the file in which the error occurs. I think we discussed this, but I don't exactly remember the outcome. One approach would be to store the original file contents in a variable, and then search within that for the part that causes the error. I'm not sure if this would actually work or not, just a thought.
  • in the PR description you mention:

Note that '$' defaults to 0 and that I chose to reindex all graphs such that vertex numbering starts at one (which I think is convention for the Digraphs package?)

this sounds appropriate, and yes this is the convention (really more a requirement in Digraphs, i.e. at present it's only possible to have digraphs with vertices [1 .. n] for some n). It'd be best if the code at the very least issued a warning when you renumber the vertices, to avoid violating the principle of least astonishment (i.e. trying to read a graph with nodes not [1 .. n], then silently getting a graph with nodes [1 .. n], would be surprising, so better to issue a warning that this is happening). It'd also be useful to have the original vertices as labels in the newly constructed digraph, so if for example the graph has 0-indexed vertices, then the labels of the vertices in the output graph would be set using SetDigraphVertexLabels(D, [0 .. n - 1]); (if this is the correct mapping).

  • You mention in a couple of other places in the PR description that your code potentially ignores some other parts of dreadnaut files. If you detect parts that are ignored for whatever reason, then please issue a warning for each of these too, again to avoid surprising the user.
  • Have you checked how good the code coverage of your tests is? Given the number of lines of code in the implementation versus the number of lines of tests, I'm guessing that there's maybe some work to do there. I've sent you a python script by email that you can use to check the code coverage, just run ./code-coverage-test-gap.py tst/standard/io.tst inside the digraphs directory.
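
The first two suggestions in this list (reporting a line number, and warning when vertices are renumbered) could look roughly like the following Python sketch. Both function names are hypothetical, and nothing here is taken from the PR's GAP code. As noted above, the search may not always work: the fragment could also occur earlier in the file, or might not survive unchanged if lines are condensed before parsing.

```python
import warnings


def line_of_fragment(original_text, fragment):
    """Return the 1-based line where `fragment` first occurs, or None if absent."""
    pos = original_text.find(fragment)
    if pos == -1:
        return None
    return original_text.count("\n", 0, pos) + 1


def renumber_from_one(edges, n, start):
    """Shift vertices start..start+n-1 to 1..n, keeping the originals as labels."""
    labels = list(range(start, start + n))
    if start != 1:
        warnings.warn(f"renumbering vertices {start}..{start + n - 1} to 1..{n}")
    shift = 1 - start
    return [(u + shift, v + shift) for u, v in edges], labels


text = "n=2\n$=1\ng\n1: 1 2;\n2: 7;\n"
print(line_of_fragment(text, "2: 7"))
```

The `labels` list returned by `renumber_from_one` is the kind of value that would then be stored as vertex labels on the resulting digraph.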

@james-d-mitchell james-d-mitchell added new-feature A label for new features. waiting for creator input A label for issues/PRs where we are waiting for the creator to do something labels May 29, 2024
@james-d-mitchell
Member

@pramothragavan please let me know when you think this is ready again, and thanks !

@pramothragavan
Author

@pramothragavan please let me know when you think this is ready again, and thanks !

Will do!

@mtorpey
Collaborator

mtorpey commented Jan 28, 2025

Hi @pramothragavan! Looking forward to hopefully seeing you soon for the new VIP.

What state did this Dreadnaut project get to? Would it be a good thing for you to get back into this semester if there's work still to do on it?

@pramothragavan
Author

pramothragavan commented Jan 28, 2025 via email

@pramothragavan
Author

This is a significant overhaul on previous versions -- WriteDreadnautGraph is untouched, but the decoder has been completely rewritten.

As @james-d-mitchell suggested, I've taken the parser used in the dreadnaut program and effectively rewritten it in GAP. The original C code uses a stream to parse character by character. GAP has a Stream object, but this lacks some of the functionality needed, so I created a record called Stream that aligns GAP's streams with how they're used in C. Other helper functions I've added:

  • DIGRAPHS_GETNWC finds the next character in the stream that is not in " ,\t"
  • DIGRAPHS_GETNWL finds the next character in the stream that is not in " \n\t\r"
  • DIGRAPHS_readinteger reads integers from the stream (i.e. avoiding issues with reading "10" as opposed to "1" and "0" that might arise when parsing character by character)
  • DIGRAPHS_GetInt also reads the next integer from the stream. There are some instances where dreadnaut allows for an optional '=' character (e.g. n=2 is the same as n2). This function ignores any '=' characters and then calls DIGRAPHS_readinteger.
  • DIGRAPHS_readgraph parses the graph's adjacency data
  • DIGRAPHS_ParsePartition is used to parse a partition, if given. The partition is stored using vertex labels.
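
To illustrate the character-by-character approach behind the helpers listed above, here is a small Python sketch of a stream with one-character push-back plus whitespace-skipping and integer-reading helpers. It is an analogy only: the class and function names are hypothetical, it handles non-negative integers only, and it is neither the GAP record from this PR nor the C code in dreadnaut.

```python
DIGITS = "0123456789"


class CharStream:
    """Character stream with push-back, similar in spirit to C's getc/ungetc."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def getc(self):
        """Return the next character, or None at end of input."""
        if self.pos >= len(self.text):
            return None
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    def ungetc(self):
        """Step back one character so the last read can be repeated."""
        if self.pos > 0:
            self.pos -= 1


def get_next_skipping(stream, skipped):
    """Return the next character not in `skipped` (None at end of input)."""
    ch = stream.getc()
    while ch is not None and ch in skipped:
        ch = stream.getc()
    return ch


def getnwc(stream):
    return get_next_skipping(stream, " ,\t")


def getnwl(stream):
    return get_next_skipping(stream, " \n\t\r")


def read_integer(stream):
    """Read a whole run of digits, so "10" is one integer, not "1" then "0"."""
    ch = getnwl(stream)
    if ch is None:
        return None
    if ch not in DIGITS:
        stream.ungetc()
        return None
    value = 0
    while ch is not None and ch in DIGITS:
        value = 10 * value + int(ch)
        ch = stream.getc()
    if ch is not None:
        stream.ungetc()  # leave the first non-digit for the caller
    return value


def get_int(stream):
    """Skip an optional '=' and then read an integer, so 'n=2' and 'n2' agree."""
    ch = getnwl(stream)
    if ch is not None and ch != "=":
        stream.ungetc()
    return read_integer(stream)
```

With this sketch, reading `n=10` and `n10` both yield the integer 10 once the leading `n` has been consumed.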

Documentation for various commands is given here (pages 6-12). Many of these are used to manipulate the graph and I have focused on supporting commands more closely tied to directly defining the graph.

For now, I need to write (many) tests but I'm also interested if there are any commands that you'd like to see support for. I'm happy to implement anything really, but didn't want to waste time on things you didn't want. The commands that I am currently supporting are:

  • All of those mentioned in section (A) of the above link. In dreadnaut, these would just define the mode which dreadnaut is using. This is important for subsequent use of nauty/traces, but is irrelevant for actually reading in the graph so this is just ignored.
  • From section (B): n=#, g (and all subcommands), _, __
  • From section (C): f
  • From section (D): $=#, $$, +, d, -d
  • From section (F): "...", !, q

I think a couple of the unsupported commands from (B) might be worth looking into. Anything unsupported currently should raise an InfoWarning, with the exception of <, >, e (these three relate to reading in, outputting and editing graphs) which raise ErrorNoReturn.
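
The warning-versus-error policy just described can be summarised in a few lines of Python. This is only a sketch of the policy: the name `handle_unsupported` is hypothetical, Python's `warnings.warn` stands in for an InfoWarning message, and `ValueError` stands in for ErrorNoReturn.

```python
import warnings

# The three commands that relate to reading in, outputting and editing graphs.
FATAL_COMMANDS = {"<", ">", "e"}


def handle_unsupported(command):
    """Raise for the three fatal commands; warn and carry on for anything else."""
    if command in FATAL_COMMANDS:
        raise ValueError(f"dreadnaut command {command!r} is not supported")
    warnings.warn(f"ignoring unsupported dreadnaut command {command!r}")
```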
