Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cancelled, not using Crystal anymore] Add abstract unix socket handling #4056

Closed

Conversation

bew
Copy link
Contributor

@bew bew commented Feb 21, 2017

Will fix #3936

Here is my first PR, I hope everything is good!

There are 2 things where I need review (there are comments in the code):

If there are some specs missing, tell me and I'll add them
I'll squash the commits in the end, when we're good!

Thanks in advance!

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of comments everywhere, mostly if can be simplified, and documentation kept as it was, but document that the socket can be abstract, then explain how different it is from normal UNIX socket, and maybe how it works (briefly).

BTW I'm not sure there is a test that verifies that we can create a STREAM server, connect a client, then transmit data? Maybe a test with DGRAM would be good, too.

addr2.path.should eq(addr1.path)
addr2.abstract?.should be_true
addr2.abstract?.should eq(addr1.abstract?)
addr2.to_s.should eq("/abstract-service.sock")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll may want to check whether sockaddr_un.sun_path actually got the null byte prepended, that addr.path didn't, and that #abstract? is true. Other expectations are redundant with the other test.

File.exists?(path).should be_false
end

File.exists?(path).should be_false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may get rid of the outer File.exists? expectations.

spec/std/socket_spec.cr Outdated Show resolved Hide resolved
@@ -256,6 +296,19 @@ describe UNIXServer do
end
end

it "returns an abstract client UNIXSocket for abstract server" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good test. I didn't thought about that 👍

@@ -289,6 +342,19 @@ describe UNIXServer do
end
end

it "returns an abstract client UNIXSocket for abstract server" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe redundant. Sure it tests with #accept? but maybe we can skip that, since #accept will always call #accept? anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I saw that too, but as there was already 2 redundant tests for non-abstract sockets:

  • describe "accept" > it "returns the client UNIXSocket"
  • describe "accept?" > it "returns the client UNIXSocket"

I wanted to have some coherence and keep 2 redundant tests for abstract sockets... So I will remove the accept? test, but should I remove also the accept? test for non-abstract socket, which is redundant too ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, remove this test only.

# Always deletes any existing filesystam pathname first, in order to cleanup
# Note for non-abstract socket:
# - The socket is bound to a filesystem pathname.
# - It always deletes any existing filesystem pathname first, in order to cleanup
# any leftover socket file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep the original phrasing (more common), then add a note about abstract socket acting on virtual files (less common), thus not creating nor deleting anything.

src/socket/unix_server.cr Outdated Show resolved Hide resolved
def initialize(@path : String, type : Type = Type::STREAM)
# Connects a named UNIX socket.
#
# A non-abstract socket is bound to a filesystem pathname.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as other places: keep the original phrasing; abstract sockets are a special case, and a note on the class if enough to understand that they won't create an actual filesystem filename.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here, do I say Connects a named UNIX socket, bound to a filesystem pathname. (which is not always true)
or Connects a named UNIX socket, bound to a pathname. (I removed the filesystem mention) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the original comment, but add a note to the UNIXSocket class about abstract (as for UNIXServer).

# it is called from UNIXServer#new, where @abstract is already initialized
# but if I remove the abstract argument here, I get the error saying that
# @abstract is not initialized in all initialize... (See issue #4055)
protected def initialize(family : Family, type : Type, @abstract = false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what's to fix. UNIXServer shouldn't set @abstract but pass abstract to the super call, so it will be set, here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see, will do that

src/socket/unix_socket.cr Outdated Show resolved Hide resolved
@bew bew force-pushed the add-abstract-UNIX-socket-handling branch 2 times, most recently from 5ad7b05 to c85d65a Compare February 22, 2017 17:22
@bew
Copy link
Contributor Author

bew commented Feb 22, 2017

Ok, I think I got them all, thanks for your suggestions!


I only spec'ed STREAM socket type, because when you say in #3750:

  • UNIXSocket and UNIXServer can now be used in DGRAM type, in addition to the default STREAM type.

Your example using UNIXSocket might work, but UNIXServer doesn't (and there is no specs about it either).
This code:

require "socket"
server = UNIXServer.new("/tmp/test.sock", type: Socket::Type::DGRAM)
server.close

Crash with: listen: Operation not supported (Errno) (with Crystal 0.20.5)
Because when using DGRAM, we must bind on it, but not listen (as a DGRAM communication is like UDP: connectionless)

Handling niceties (with UNIXServer and 'high level' objects) with DGRAM socket type is not trivial, I think it should involve another PR for DGRAM support for (non-)abstract sockets (and we could reopen #3214 to continue discussion, or open a new issue)

@bew
Copy link
Contributor Author

bew commented Mar 4, 2017

Ping

Is there something missing?
Do you know why the CI fails everywhere?

From the man:

the abstract namespace were introduced with Linux 2.2 and should not be used in portable programs

I don't think the ci kernel is before 2.2, and I have no clue what's broken..

@bew
Copy link
Contributor Author

bew commented Mar 7, 2017

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Mar 7, 2017

This looks good now, but I didn't realize abstract UNIX sockets were Linux specific. They aren't supported by any other unices (e.g. macOS, BSD), and certainly not standardized.

In this light, the abstract: true parameter and abstract? methods aren't good solutions. Maybe we should just have a little hack, expecting @ in the socket path, and only transforming it to the null byte (and vice-versa) for Linux when present, then document it slightly into UNIXAddress.

@bew
Copy link
Contributor Author

bew commented Mar 7, 2017

I didn't realize abstract UNIX sockets were Linux specific. They aren't supported by any other unices (e.g. macOS, BSD), and certainly not standardized.

Oh didn't realized that too

In this light, the abstract: true parameter and abstract? methods aren't good solutions. Maybe we should just have a little hack, expecting @ in the socket path, and only transforming it to the null byte (and vice-versa) for Linux when present, then document it slightly into UNIXAddress.

What you are suggesting is to drop the abstract: true and only rely on some UNIXAddress checks ?

In UNIXSocket and UNIXServer we will have some conditionals based on flag?(:linux) or something like that to check if we're on Linux, and check if the UNIXAddress is abstract and need special behavior ?

@ysbaddaden
Copy link
Contributor

I have a patch against your work. I'll push it a bit later.

@bew
Copy link
Contributor Author

bew commented Apr 5, 2017

@ysbaddaden do you have an update on this ?

@ysbaddaden
Copy link
Contributor

Weird. I did push and verified I did. I must have pushed to the wrong origin. I'll check when I'm back to my computer.

@sdogruyol
Copy link
Member

@RX14 would you like to re-review this?

@bew
Copy link
Contributor Author

bew commented Oct 24, 2017

I rebased & squashed the commits.

I think that we should discuss how we want to handle this, as it's not available in all plateform (it's only a linux thing).

Also, I think the Socket stuff (especially the easy wrapper classes like UNIXServer, TCPSocket) should change a little (see my comment here #4317 (comment)).

@bew bew force-pushed the add-abstract-UNIX-socket-handling branch from bfcda1d to e4de6e2 Compare October 24, 2017 12:40
@ysbaddaden
Copy link
Contributor

I still believe abstraction should still be supported transparently using an ASCII symbol prefix such as @ that UNIXAdress would transform to a NULL byte on Linux but either do nothing or raise on other platforms.

Impact on the existing API would be minimal, and support for Linux transparent.

@RX14
Copy link
Contributor

RX14 commented Oct 24, 2017

Usually i'd disagree, we should use named arguments instead of cryptic symbols in strings, but in fact the fact that this is an abstract unix socket is simply part of the name, placing it in another namespace.

The problem is though, that you can have a file called @foo just fine in linux. Suddenly our method doesn't take arbitrary paths (you have to add ./, but only in an edge case, so people will forget until they want to configure it with a path starting with @).

So i don't know. Perhaps This way is better and raise on unsupported platforms.

@bew
Copy link
Contributor Author

bew commented Nov 27, 2017

Now I think that we should keep it simple and compile-time safe, so using a named arg to create an abstract UNIXAddress and don't rely on a @ prefix. However when using UNIXAddress#to_s we can add a @ prefix for display purpose only.

Regarding availability, I think it is wrong to raise at runtime when we can know at compile-time that it is not supported.
Now, to allow the user to write abstract unix socket code without compile-errors, we could have a CONST (for access in macros), and the user will have to check if they're available before use:

# In stdlib
{% if flag?(:linux) %}
  ABSTRACT_UNIX_AVAILABLE = true
{% else %}
  ABSTRACT_UNIX_AVAILABLE = false
{% end %}

# In user code
{% if ABSTRACT_UNIX_AVAILABLE %}
  # Use abstract unix socket
{% else %}
  # Use normal unix socket
{% end %}

@RX14
Copy link
Contributor

RX14 commented Nov 27, 2017

No, code that compiles on one platform should compile on another.

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Nov 27, 2017

That's just more complex, and requires to adapt the user facing ABI to accommodate a feature only available on one specific platform, and documented as non portable.

I believe using a leading @ hack is acceptable. But since "\0hidden" is a valid string in Crystal now, maybe we can just accept a leading null byte for linux targets only (i.e. abstract namespace), yet make sure any other null byte in the string is still prevented (i.e. security issue), and make sure the string doesn't have any null byte on non-linux targets.

@RX14
Copy link
Contributor

RX14 commented Nov 27, 2017

@ysbaddaden "\0" has always been a valid crystal string because NUL is valid utf8.

The problem is that it just pushes the complexity on the user: now they have to manually add the null byte if they want an abstract socket instead of using a boolean and just using a named arg.

@ysbaddaden
Copy link
Contributor

Isn't that acceptable? After all, the user decided to deal with the complexity of a non portable feature.

My take is: let's take care to not break this if we can, but let's not bother to integrate it nicely.

@RX14
Copy link
Contributor

RX14 commented Nov 27, 2017

I'm on the fence. I guess it's all bikeshedding on a very rare feature in the end. So let's implement the simplest method: allow a NUL at the start of the string and document the \0 hack. We can always change it to a better interface later, and it's easier to go from this simple method to adding a named arg than go back.

@bew bew force-pushed the add-abstract-UNIX-socket-handling branch 2 times, most recently from dc2dc7b to 9bcdc01 Compare November 28, 2017 00:46
@bew
Copy link
Contributor Author

bew commented Nov 28, 2017

I'm ok with that, I implemented it, let me know how you feel about it.

For the specs on abstract socket, as they'll ultimatly fail on anything that's not linux, do I wrap them in {% if flag?(:linux) %} ?

Or should I hide the check somewhere in UNIXAddress (e.g: use non-abstract socket when not on linux) ?
Reflecting that last thought, this could work:

addr = UNIXAddress.new "\0/tmp/some_path"
addr.abstract? # => true   (on Linux)
addr.abstract? # => false  (on non-Linux)

If the path is usable as a non-abstract socket the same code could work seamlessly.

@bew bew force-pushed the add-abstract-UNIX-socket-handling branch from 9bcdc01 to c81e437 Compare November 28, 2017 00:51
@bew
Copy link
Contributor Author

bew commented Nov 28, 2017

You're right @RX14 thanks, I won't touch anything for a few days until more people gave their opinion and (hopefully) a consensus is found!

Edit: I'll only update little fixes that do not change the objective ;)

end
{% else %}
it "raises for abstract UNIX address on non-Linux" do
expect_raises(ArgumentError, /not supported/) do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need a regexp here, using "not supported" as argument will match if the string is included in the error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, I though using a String would make it a exact match check. Stdlib is well done, it's an inclusion check, thanks 😄

Copy link
Contributor Author

@bew bew Nov 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's at least one other place in this file where a Regex is used instead of a String for a similar case, but I think it's not in this PR's scope to change that, I'm just mentioning it so that the knowledge is shared (no idea if it's helpful at all ^^)

expect_raises(Socket::Error, /Invalid IP address/) do

end
end

it "is not abstract on non-Linux" do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this example should not run on linux target?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no indeed

server.abstract?.should be_false
end
end
{% end %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two macro branches can be combined.

File.exists?(path).should be_false

UNIXServer.open('@' + path) do
File.exists?(path).should be_false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also doublecheck that '@' + path doesn't exist either? Might be paranoid...

raise ArgumentError.new("Path size exceeds the maximum size of #{MAX_PATH_SIZE} bytes")
end

if path[0]? == '@'
Copy link
Member

@straight-shoota straight-shoota Nov 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path.starts_with?('@') ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I didn't want to do that because unconsciously for me starts_with? is to check that it starts with a String, not a Char....

Nevermind I'll change it.

@path = String.new(sockaddr.value.sun_path.to_unsafe)

path = sockaddr.value.sun_path
if path[0]? == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path.starts_with?(Char::ZERO) ?

Copy link
Contributor Author

@bew bew Nov 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path is not a String here, it's a StaticArray(UInt8).

Based on that, you made me think that it would be even more correct to check for equality with 0_u8 (even though there's an automatic conversion happening here somewhere)

io << path
end

def to_unsafe : LibC::Sockaddr*
sockaddr = Pointer(LibC::SockaddrUn).malloc
sockaddr.value.sun_family = family
sockaddr.value.sun_path.to_unsafe.copy_from(@path.to_unsafe, @path.bytesize + 1)

destination = sockaddr.value.sun_path.to_unsafe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sun_path is a LibC::Char[?]: Use a Slice with #to_slice instead of raw #to_unsafe Pointer.

destination[0] = 0_u8
destination += 1
end
destination.copy_from(@path.to_unsafe, @path.bytesize + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is the reason you should prefer Slice: Your write to destination is bound to the size of @path, and not checked to the size of destination. You get this check for free with Slice.

Copy link
Contributor Author

@bew bew Nov 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Papierkorb !
What about @path should I keep the #to_unsafe? (I think it's not harmful here, but maybe there's a better way I don't see?)

Also, I can't remember/can't find what this + 1 in @path.bytesize + 1 for the count is for, it seems to work without it, soo... If there's no objection I'll remove it with the .to_slice fix soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I can't remember/can't find what this + 1 in @path.bytesize + 1 for the count is for, it seems to work without it

Looks like it was intended to copy the trailing NUL. You don't "need" it right now cause the call to Pointer#malloc above asks BoehmGC to allocate it, and BoehmGC zeros memory for you. Thus, you should either 1) Copy the trailing NUL 2) or set it manually. Not doing so makes that code rely on an implementation detail of the current GC implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @path.to_unsafe could be replaced with String#to_slice. I'm not sure if the returned slice contains the trailing NUL, please verify that.

Copy link
Contributor

@RX14 RX14 Nov 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Papierkorb the returned slice doesn't contain the null - why would it? Slice is a crystal type and crystal doesn't use null terminated strings. The null is only there as a convenience to C.

@ysbaddaden
Copy link
Contributor

IMHO: you can drop most of the PR, and merely:

  1. add a UNIXAddress#abstract? method that returns true if the target is Linux + the string is at least 1 char + the first char is a null byte, and returns false otherwise (no need for @abstract, no need to meddle with @path in UNIXAddress);
  2. memorize @abstract = addr.abstract? in UNIXServer;
  3. don't delete the file in UNIXServer#close if @abstract.

That is, a best effort to support the abstract namespace for UNIX sockets on Linux with minimal changes. I don't think we have to document it, either, but if we do, a oneliner would be enough:

You may use abstract sockets on Linux by prefixing the path with a null byte, for example "\0hidden".

@straight-shoota
Copy link
Member

I want to weigh in on the usage of @ as an indicator for abstract paths, similar to what I've previously argumented about supporting Unix sockets in HTTP::Server:
It should be easy to let the unix socket be configured by an external config setting (command line argument, config file etc.). The usual way to do this is @ as a prefix. It would be terrible user experience if you'd have to use null byte in a configuration value and probably break things.
I don't think the application should be required to interpret such a config value and replace @ with null in order to make abstracts sockets work, as this would easily be forgotten. The stdlib should honor and encourage to use the de-facto standardized format format for user input.
This would not necessarily mean to drop the null byte implementation, but there should at least be an easy way to support @-based abstract socket paths without any additional effort.

@bew bew force-pushed the add-abstract-UNIX-socket-handling branch from df87705 to 5f2f223 Compare December 1, 2017 13:00
@@ -24,17 +24,21 @@ class UNIXServer < UNIXSocket
# Creates a named UNIX socket, listening on a filesystem pathname.
#
# Always deletes any existing filesystam pathname first, in order to cleanup
# any leftover socket file.
#
# Note: An abstract UNIX server act on virtual files, thus not creating nor deleting anything.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note -> NOTE

@bew
Copy link
Contributor Author

bew commented Jan 5, 2018

there should at least be an easy way to support @-based abstract socket paths without any additional effort.

I agree, maybe we can allow UNIXAddress creation with a @-based path, which would be immediately converted to a \0-based path, and stored as is.

The rules I see would be:

  • UNIXAddress.new("\0foo").abstract? => true
  • UNIXAddress.new("@foo").abstract? => true
  • UNIXAddress.new("./@foo").abstract? => false

This would allow to do this PR with minimal changes as @ysbaddaden described, but still allow simple use of abstract unix sockets IMO.

@bew
Copy link
Contributor Author

bew commented Jan 6, 2018

@ysbaddaden I tried to "drop most of the PR" to have only minimal changes here, but there are edge cases that makes me re-add most of the things that I dropped..

Like the simple fact that a UNIXSocket can be created without a path (from its FD, when a UNIXServer accepts a new connection), but I still need to know whether the socket is abstract or not (without a UNIXAddress to do the check for me)..

I used the same specs as this PR, but tried to keep the other changes simple & small.

If some wants to make comments on this other implementation tentative, tell me and I can open another PR (or push it here?).

If someone wants to give it a try, grab the spec file of this PR and try to make minimal changes to make it work, maybe I missed something obvious that you'll do 😃

@jwoertink
Copy link
Contributor

HouseKeeping: @bcardiff I noticed this PR has been approved by 2 core members. Is this still a viable PR? @bew maybe this needs an update rebase?

@straight-shoota
Copy link
Member

@jwoertink Since both approvals where given, the code has changed a lot. It needs a rebase first and then two new reviews.

@bew
Copy link
Contributor Author

bew commented Sep 20, 2021

Hello, I'm cleaning up old PRs of mine on projects I don't follow anymore.

What do we do with this PR ? Close? or someone wants to take it?

@bew
Copy link
Contributor Author

bew commented May 10, 2024

Following my last comment, I'm cleaning up my old PRs, and this seems to be going nowhere..
The code is probably way outdated now (I didn't follow Crystal's development for a few years now 😬)

If anyone wants to ever implement this, please re-open the PR as yourself.
I'll close this, have a good day everyone, contributing to Crystal have been a lot of fun o/

@bew bew closed this May 10, 2024
@bew bew changed the title Add abstract unix socket handling [cancelled, not using Crystal anymore] Add abstract unix socket handling May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exception when closing UNIXServer with abstract socket address