Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return record #3155

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Return record #3155

wants to merge 2 commits into from

Conversation

HansOlsson
Copy link
Collaborator

Enhancement to allow functions to return record values.

See #3130 (comment)

Note that this should be fairly straightforward to implement.

@HansOlsson HansOlsson added the enhancement New feature or request label Apr 26, 2022
Copy link
Collaborator

@henrikt-ma henrikt-ma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be worked out in conjunction with #3154.

@@ -2229,8 +2229,8 @@ \subsubsection{Records}\label{records}
Arrays cannot be mapped.
\end{itemize}

Records are passed by reference (i.e.\ a pointer to the record is being
passed).
Records that are the return value of the C function are returned by value.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be allowed as I don't see that we should rely on any definition of CRec in the signature. See #3154.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem with returning structs by value, as it has been done in C since at least the 1980s.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works well if nested records are also stored as struct und not as pointer in the surrounding struct. Else the question about the memory management arises again.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a concern about returning records by value in C in general, but a concern about using this mechanism in Modelica's external function protocol. The problem we have is that the external function should come with a prototype, and that prototype will have a definition of the record type that the tool generating the call to the external function cannot reasonably be expected to be aware of. (At least I don't want a solution where the tool needs to scan files included via the Include annotations in search of a the definition of the record used as the external function's return type.)

This test program demonstrates the problem:

struct A
{
  double x;
  int y;
};

struct B
{
  double x;
  int y;
};

struct A
f()
{
  struct A a;
  a.x = 0.5;
  a.y = 1;
  return a;
}

int
main()
{
  struct B b = f();
  return 0;
}

Clang rejects this with:

return_record.c:25:12: error: initializing 'struct B' with an expression of incompatible type 'struct A'
  struct B b = f();
           ^   ~~~

Beside the problem that the program is rejected by Clang, we have the potential inefficiency of returning big structs by value.

Just for clarity, this is the way I believe it needs to be done:

void
f(void* a)
{
  struct A* a_ = (struct A*)a;
  a_->x = 0.5;
  a_->y = 1;
}

int
main()
{
  struct B b;
  f((void*)(&b));
  return 0;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @henrikt-ma

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is horribly clunky.

First and foremost using void*-pointers is just asking for trouble - and #3198 gives a way of ensuring that the records are consistently defined.

For the possibility of return structs - what many/most/all(?) C-compilers internally do when you declare a function with a large struct as return value is to send in the return address to the function.

So, basically the C-compiler already do the same as this proposal, but automatically, type-safe, and only when needed. Having that as an option for users makes a lot more sense.

chapters/functions.tex Show resolved Hide resolved
chapters/functions.tex Show resolved Hide resolved
@henrikt-ma
Copy link
Collaborator

I think this issue has received too little attention, considering how many MA members that I think would have valuable input to the discussion. Could someone please help me ping the right people? For example, maybe @beutlich, @sjoelund, @christoff-buerger or @andreas-junghanns can relate to the question of how one should return records from the implementation of an external "C" function?

@DagBruck
Copy link
Collaborator

DagBruck commented May 5, 2022

I have problems with your example involving records A and B. Your example is correct Modelica, but it is not correct C, which uses name-equvalence instead of structural equivalence. Ok, you may argue that two structs declared identically should give the same layout, but then somebody adds a #pack declaration and everything falls apart.

My suggestion is that we look if we could make a name mapping, telling the Modelica translator that "for this record please use this type name in the generated code". That allows you to define the type in C and your generated code will adapt in a clean way.

By the way, returning struct as a function return value has been well-defined for decades. Don't think you should do it by sending in "out" pointers. Maybe different for Modeica functions returning multiple values, have we defined that for external functions?

@HansOlsson
Copy link
Collaborator Author

Maybe different for Modeica functions returning multiple values, have we defined that for external functions?

There are two cases for multiple outputs:

  • No external declaration. In that case the C-function is not assumed to return one of them.
  • An external declaration, e.g. external "C" a=foo(b,c); then a is assumed to be a return value (and it should be declared as output). And if b or c (or both) are outputs they are handled as pointers to the outputs.

@henrikt-ma
Copy link
Collaborator

I have problems with your example involving records A and B. Your example is correct Modelica, but it is not correct C, which uses name-equvalence instead of structural equivalence. Ok, you may argue that two structs declared identically should give the same layout, but then somebody adds a #pack declaration and everything falls apart.

Yes, this is the point of the example.

My suggestion is that we look if we could make a name mapping, telling the Modelica translator that "for this record please use this type name in the generated code". That allows you to define the type in C and your generated code will adapt in a clean way.

This is an alternative I have also considered, but I thought it would be considered too complicated compared to using a void* and let each side of the interface define its own representation of the record. Note that one of the problems we'd have to address with this approach is how to mangle all crazy identifiers a Modelica record member could have (the name of the record itself wouldn't necessarily be a problem as long as one could provide an annotation to the function component declaration, telling the name of the corresponding external C record).

By the way, returning struct as a function return value has been well-defined for decades. Don't think you should do it by sending in "out" pointers. Maybe different for Modeica functions returning multiple values, have we defined that for external functions?

Yes, there is no question about returning struct by value being perfectly fine in C in general. The question is specific to returning records from extern "C" in Modelica.

Note that we must anyway support returning records via pointer to allocated storage, in case there are multiple outputs from the function. Thus there is hardly any added value in also allowing the records to be returned by value.

@DagBruck
Copy link
Collaborator

DagBruck commented May 5, 2022

I have problems with your example involving records A and B. Your example is correct Modelica, but it is not correct C, which uses name-equvalence instead of structural equivalence. Ok, you may argue that two structs declared identically should give the same layout, but then somebody adds a #pack declaration and everything falls apart.

Yes, this is the point of the example.

Fine, all in areement here.

My suggestion is that we look if we could make a name mapping, telling the Modelica translator that "for this record please use this type name in the generated code". That allows you to define the type in C and your generated code will adapt in a clean way.

This is an alternative I have also considered, but I thought it would be considered too complicated compared to using a void* and let each side of the interface define its own representation of the record. Note that one of the problems we'd have to address with this approach is how to mangle all crazy identifiers a Modelica record member could have (the name of the record itself wouldn't necessarily be a problem as long as one could provide an annotation to the function component declaration, telling the name of the corresponding external C record).

I mst admit I have never quite understood the name-mangling of record members that some tools seem to apply (_0member).

By the way, returning struct as a function return value has been well-defined for decades. Don't think you should do it by sending in "out" pointers. Maybe different for Modeica functions returning multiple values, have we defined that for external functions?

Yes, there is no question about returning struct by value being perfectly fine in C in general. The question is specific to returning records from extern "C" in Modelica.

But here the question is also how we most easily adapt to what we think is the common programming style in C. If returning structs is common, then that is the natural option. If returning by in/out pointer arguments is common, then we should do that. Maybe we need to search for common real-world patterns.

@christoff-buerger
Copy link
Member

There is a reason why the FMI group avoids structs. Maybe they have some input?

Here we are even talking about language border crossing structs (Modelica records <-> C structs). My assumption is: It needs guided/configured code generation and compilation on the fly on both ends (as we know from techniques like CORBA etc).

@HansOlsson
Copy link
Collaborator Author

HansOlsson commented May 5, 2022

There is a reason why the FMI group avoids structs. Maybe they have some input?

Note that in Modelica 3.4 we added a way to avoid structs - even if the Modelica function has records - by sending each record member separately. So in one way one can argue that structs are only used when the user wants to.

Having a way to specify the C-name of the record, including the header containing its definition (and either with standardized names for the elements or a way of specifying them) would fix the remaining issue.
We could then deprecate using records without a specified C-name for external functions.
Added: Or not deprecate it and view it similar to the default calling of external functions.

However, that is somewhat orthogonal to this PR about returning records from external funcitons.

Here we are even talking about language border crossing structs (Modelica records <-> C structs). My assumption is: It needs guided/configured code generation and compilation on the fly on both ends (as we know from techniques like CORBA etc).

Guided code generation is needed for some cases, but given that the main use case is that you have a specific existing C-library that you want to call I don't think it is needed in most cases for Modelica.

It is clearly interesting if we want to add more generic support of calling other languages, e.g. JNI for Java; https://modelica.org/events/modelica2006/Proceedings/sessions/Session5a3.pdf

@henrikt-ma
Copy link
Collaborator

In another attempt to reach out for input from MAP-FMI people with the deep understanding of this kind of topic, perhaps @pmai has some thoughts to share? (Previously asked @andreas-junghanns, but so far no response.) For background in a nutshell: the topic is how to handle Modelica records (possibly nested) in the external C interface.

@HansOlsson
Copy link
Collaborator Author

In another attempt to reach out for input from MAP-FMI people with the deep understanding of this kind of topic, perhaps @pmai has some thoughts to share? (Previously asked @andreas-junghanns, but so far no response.) For background in a nutshell: the topic is how to handle Modelica records (possibly nested) in the external C interface.

I don't understand what you are asking for, as the deep understanding has already been given.

Returning records has been part of C since at least the 1980s, and is documented in ABIs.
This includes nested records.

The goal of this PR isn't to mandate that function return records, but to allow it - just being a few decades after C. Thus it does not matter that many coding styles don't use it in C (since the return value is a status code as in FMI, and structs are often input/output arguments). In C++ they are used more extensively (also for classes) requiring special optimizations like RVO and move-semantics.

@henrikt-ma
Copy link
Collaborator

I don't understand what you are asking for, as the deep understanding has already been given.

I am hoping that others would also see that there is a clear problem with passing by value when the code we generate for calling the external function doesn't have access to the same struct definitions used in the prototype of the external function (provided by means of a header given in the Include-annotation). The natural way to avoid all problems is to use void pointers (which happens to imply that records won't be possible to return by value, but that's totally fine by me).

Note that we already use void pointers successfully for handling external objects.

@HansOlsson
Copy link
Collaborator Author

I don't understand what you are asking for, as the deep understanding has already been given.

I am hoping that others would also see that there is a clear problem with passing by value when the code we generate for calling the external function doesn't have access to the same struct definitions used in the prototype of the external function (provided by means of a header given in the Include-annotation).

That is not what this PR is about; it's about returning records as stated in the name.
Blocking issues due to unrelated problems is not constructive, please stop with that.

Specifically this PR is about ensuring that external function can handle return values using the same record definition that are already allowed for arguments. That is perfectly supported by the C-standard and has been for decades. Having an Include-annotation would help both in terms of C-standard conformance, but in practice it works for most platforms even without that.

@henrikt-ma
Copy link
Collaborator

That is not what this PR is about; it's about returning records as stated in the name. Blocking issues due to unrelated problems is not constructive, please stop with that.

I'm sorry, but I can't be in favor of the change as long as I don't see how the two could be unrelated. Returning a record means that there would be a struct in the return type of the external function's prototype, and it just doesn't seem feasible to be to generate clean code for calling such a function without having full access to the definition of that struct. It is because I don't see the feasibility of doing that (way too many annotations needed to to make it work), that I think returning records by value from external Modelica functions (not in C in general, of course) is not a good idea. By only returning records by pointers to allocated output memory one would get a realistic setting for both implementors of external functions and for tools that need to generate the code for calling them.

@HansOlsson
Copy link
Collaborator Author

HansOlsson commented Jun 7, 2022

That is not what this PR is about; it's about returning records as stated in the name. Blocking issues due to unrelated problems is not constructive, please stop with that.

I'm sorry, but I can't be in favor of the change as long as I don't see how the two could be unrelated. Returning a record means that there would be a struct in the return type of the external function's prototype, and it just doesn't seem feasible to be to generate clean code for calling such a function without having full access to the definition of that struct.

Sending in a pointer to the struct, as currently allowed, has the exact same issues.
If there are alignment issues, or if an implementation is really insisting on strict conformance for records it will be the same problem for both what we currently have and for returning records.

I'm not against adding Include-annotations for records, but it is a separate issue.

It is because I don't see the feasibility of doing that (way too many annotations needed to to make it work), that I think returning records by value from external Modelica functions (not in C in general, of course) is not a good idea. By only returning records by pointers to allocated output memory one would get a realistic setting for both implementors of external functions and for tools that need to generate the code for calling them.

Not true at all. That has the same problems (with alignment, layout, and strict C conformance for names etc; as we already have for pointers to structs and is proposed for returning records), and adds the possibility of incompatible memory allocation on top of that. (And in practice also requires manual memory management, that people often mess up.)

It's a really really bad idea.

@henrikt-ma
Copy link
Collaborator

Sending in a pointer to the struct, as currently allowed, has the exact same issues. If there are alignment issues, or if an implementation is really insisting on strict conformance for records it will be the same problem for both what we currently have and for returning records.

Yes, the issues are the same, but passing a void * makes it clear that the mechanism can only work if some underlying assumptions about alignment and layout are fulfilled, and it allows generation of code that will compile cleanly.

(And in practice also requires manual memory management, that people often mess up.)

No, as allocated memory is passed to the external function, no manual memory management is required on the external function side. I don't see any difference to the way we return arrays.

@HansOlsson
Copy link
Collaborator Author

Sending in a pointer to the struct, as currently allowed, has the exact same issues. If there are alignment issues, or if an implementation is really insisting on strict conformance for records it will be the same problem for both what we currently have and for returning records.

Yes, the issues are the same, but passing a void * makes it clear that the mechanism can only work if some underlying assumptions about alignment and layout are fulfilled, and it allows generation of code that will compile cleanly.

That would be a step back.

Instead I want to take two separate steps forward:

  1. Add possibility of returning records in this PR.
  2. Solve the declaration of struct (as a separate issue Propose complete handling of records in external functions. #3198 ) which would fully solve the problem of pointers and returning records.

Switching to void* is a proposal that would:

  • Break compatibility for existing code.
  • Increase the risk of errors; as in practice anything will match. Solving the declaration-problem as a separate issue is a better alternative that goes from structs that work in most implementation to a variant that is fully conformant with the C-standard in a backward compatible way.
  • Don't solve anything. Note that going through void* doesn't actually solve any of the compatibility issues with structs; it just attempts to hide them.

Side-note: External objects are different as it's only the external code that accesses the contents of the external object, but these structs are sent between external code and Modelica code, and modified by both.

Proposing void* is not only a bad proposal - it's also a separate proposal, so it would still be blocking an issue due to an unrelated problem, which is still not constructive.

In contrasts extending the declarations of structs in general, #3198 , is a straightforward extension, that takes less time to implement than concluding this pseudo-discussion.

(And in practice also requires manual memory management, that people often mess up.)

No, as allocated memory is passed to the external function, no manual memory management is required on the external function side. I don't see any difference to the way we return arrays.

If "returning records by pointers to allocated output memory" means only handling records by pointer to memory allocated by the caller, it only means that we still have the same minor problem as today - whereas returning records wouldn't introduce any new ones.

Note that this is still only relevant if the external call wants to return a struct it's an entirely optional proposal for modelers.

@henrikt-ma
Copy link
Collaborator

As seen in #3154, we don't really have any working mechanism at all for records at the moment, so whatever we do here should be seen as fixing that problem, not extending upon something that is already working.

I think the best way to proceed is to first settle #3198, and then return to this issue. (I expect that settling #3198 will end up in a clear decision between going with void pointers or going for the suggested complete specification of C structs.)

@HansOlsson
Copy link
Collaborator Author

We can handle #3198 first, but is not blocking this issue.

The reason is that they are largely independent, and thus there is little or no overlap, and therefore no or few merge conflicts.

As seen in #3154, we don't really have any working mechanism at all for records at the moment, so whatever we do here should be seen as fixing that problem, not extending upon something that is already working.

In practice we have a working mechanism for records at the moment and have had since Modelica 1 - and they are used. However, that mechanism relies on specifics of ABIs on most used platforms, but is not fully according to the C-standard. That can be improved in a backwards compatible way in #3198 but it is not needed

@HansOlsson HansOlsson added this to the Phone 2022-3 milestone Jun 12, 2022
@beutlich beutlich requested a review from gkurzbach July 7, 2022 14:32
@HansOlsson HansOlsson modified the milestones: Phone 2022-3, Phone 2022-7 Dec 13, 2022
@@ -2229,8 +2229,8 @@ \subsubsection{Records}\label{records}
Arrays cannot be mapped.
\end{itemize}

Records are passed by reference (i.e.\ a pointer to the record is being
passed).
Records that are the return value of the C function are returned by value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @henrikt-ma

@HansOlsson
Copy link
Collaborator Author

To clarify how ABIs impact this. However, the main idea is to not care and let C-compilers handle the C-code without trying to be clever.

Whether the variable is a struct or not isn't that significant nowadays (it seems that it matters on VAX) - but it somewhat depends on how homogenous the struct is, and as long as the records are "compatible POD" the exact declarations aren't that critical - but #3198 should clear up that.

But after some investigating - for return values on x86-64:

Note that the Windows normal ABI will also handle input structs containing two or more doubles by sending a pointer to the struct - so if we say that since some ABIs cannot return a complex number in registers we shouldn't try to declare a C-function returning such a struct, we should logically then also do the same for any complex input.

The idea is that by just declaring it as a C-function returning a struct (and fixing the types of records as in #3198 ), we let the C-compiler handle it; if users have such functions. If the C-function takes a pointers to the outputs you don't have to say that it returns something, and can use a different external-declaration - as before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants