Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement FilePath.getCurrentWorkingDirectory() #71

Closed
wants to merge 3 commits into from

Conversation

GeorgeLyon
Copy link
Contributor

@GeorgeLyon GeorgeLyon commented Nov 13, 2021

I'm not sure how to mock system_getcwd, but otherwise this works. We may eventually want to have a stack-allocated buffer of PATH_MAX size so we don't do two allocations, but I don't think we can achieve this in present-day Swift.

@karwa
Copy link

karwa commented Nov 13, 2021

Calling getcwd with a null pointer is technically unspecified by POSIX. Also, PATH_MAX is not the maximum length of a path (this is a common misconception, see this article):

If the buffer you supply is too small, getcwd() will return -1 and set errno to ENAMETOOLONG. Since paths can be of arbitrary size, to correctly use getcwd() you actually need a loop that resizes the underlying buffer and retries when this happens.

Glibc has get_current_dir_name, which is a better alternative.

It may also be worth emphasising the point about this being process-wide mutable state. For example, Foundation's documentation has a big, red warning box about it.

Personally, I think the concept of a process-wide, mutable "current working directory" is a serious mistake; a dinosaur we should allow to go extinct. One thread can start working on some files, assuming one working directory, and then another thread can change the working directory and affect the way paths are resolved by the other thread. There's a good argument for leaving this out of swift-system, and for higher-level libraries such as Foundation to define their own working directory concepts using Task-local state.

Copy link
Contributor

@milseman milseman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! @lorentey your take?

The man page says:

These routines have traditionally been used by programs to save the name of a working directory for the purpose of returning to it. A much faster and less error-prone method of accomplishing this is to open the current directory (`.') and use the fchdir(2) function to return.

I think it makes sense to also land changeWorkingDirectory in the same release as this, but where would it live? (This doesn't have to block this PR, of course).

defer { system_free(cwd) }
return FilePath(platformString: cwd)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way back when I was thinking this would be hosted on some kind of current process type, but I think this approach makes more sense. It's also available for typename elision IIUC:

func foo(_: FilePath)
let x = foo(.getCurrentWorkingDirectory())

I feel like making this a function rather than a throwing var makes sense (ignoring tools version concerns) as it will make a syscall.

@GeorgeLyon
Copy link
Contributor Author

Personally, I think the concept of a process-wide, mutable "current working directory" is a serious mistake; a dinosaur we should allow to go extinct.

I'm amenable to this idea, but perhaps the piece that should "go extinct" is the mutable bit, meaning we don't provide API to set the working directory (or implement it as unavailable). I have similar thoughts about setenv. I think it still makes sense for a process to ask "which directory was I launched in", since otherwise command-line scripts become pretty useless, and this is why I want this for my Shwift project.

Glibc has get_current_dir_name, which is a better alternative.

get_current_dir_name does seem better, but it requires _GNU_SOURCE and I'm not sure how swift-system deals with this.

@lorentey
Copy link
Member

lorentey commented Nov 13, 2021

Personally, I think the concept of a process-wide, mutable "current working directory" is a serious mistake; a dinosaur we should allow to go extinct.

Swift System's job is to faithfully expose the underlying system's APIs, not to reimagine them. We can be selective about which syscalls we expose, but this is not the right place (nor the right forum) to discuss changing the underlying OS functionality.

I believe there are legitimate reasons Swift programs might want to get (or even set) the working directory, so it seems reasonable for System to expose related syscalls.

@milseman
Copy link
Contributor

get_current_dir_name does seem better, but it requires _GNU_SOURCE and I'm not sure how swift-system deals with this.

Hmm, however NIO handles it. @Lukasa , do you know?

Copy link
Member

@lorentey lorentey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very much desirable, and I agree it's good to have this as a static FilePath member.

Comments below.

/// Returns the current working directory of this process.
///
/// - Warning: This value is global and care should be taken to make sure it does not unexpectedly change.
public static func getCurrentWorkingDirectoryForProcess() throws -> FilePath {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should lose the get prefix -- per the Swift API design guidelines, functions without side effects should read as noun phrases.

I know you specifically added this in response to discussions, but I would like to also remove the ForProcess suffix. There are two reasons for this:

  • First, I don't think it makes sense to try documenting the precise semantics of these functions through their naming -- it seems futile to me, and it leads to an unwieldy naming scheme.

  • Far, far more importantly, while the default semantics of these syscalls have been pretty stable, there is no reason an OS cannot switch to different behavior in certain circumstances. For instance, on Linux and macOS at least we have non-portable ways to set up a thread-local working directory.

Suggested change
public static func getCurrentWorkingDirectoryForProcess() throws -> FilePath {
public static func currentWorkingDirectory() throws -> FilePath {

///
/// - Warning: This value is global and care should be taken to make sure it does not unexpectedly change.
public static func getCurrentWorkingDirectoryForProcess() throws -> FilePath {
guard let cwd = system_getcwd(nil, 0) else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will allocate twice -- wouldn't it be better if we supplied this with an appropriately sized FilePath buffer and only resize it if we guessed wrong?

If desired, we can also eliminate allocations altogether by adding a variant that takes an inout FilePath. (I don't think we need that.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be better but given all the discussion about how this shouldn't be used often, I preferred the simpler variant rather than adding an infrequently taken second codepath.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FilePath used to have a with-style internal init that used max-path-length, but it turns out there are multiple notions of such a thing and it's probably too complicated as-is (and wastes space). I agree that it doesn't matter for now; in the future we'd like to do it with something like an init on FilePath, which can make the space/speed tradeoffs.

@lorentey
Copy link
Member

lorentey commented Nov 13, 2021

I think it makes sense to also land changeWorkingDirectory in the same release as this, but where would it live? (This doesn't have to block this PR, of course).

I suggest simply adding them as a methods on FilePath and FileDescriptor:

extension FilePath {
  func setCurrentWorkingDirectory() throws
}
extension FileDescriptor {
  func setCurrentWorkingDirectory() throws
}

(We can also call these changeCurrentWorkingDirectory or setAsCurrentWorkingDirectory(), or something similar.)

@karwa
Copy link

karwa commented Nov 14, 2021

Swift System's job is to faithfully expose the underlying system's APIs, not to reimagine them. We can be selective about which syscalls we expose, but this is not the right place (nor the right forum) to discuss changing the underlying OS functionality.

Right, the argument I was making is that it may not be a good idea to expose it. I'm aware that Github comments are not an effective way to change the world's operating systems 😄. This is an ancient API that is deeply flawed when used on modern systems, and almost every OS warns against using them or has introduced better APIs as an alternative:

  • Firstly, POSIX introduced the -at variant of filesystem functions which takes the working directory as a parameter
  • As you mentioned, Linux and macOS have thread-local working directories available using a different API
  • On Windows, relative paths are just straight-up not supported from multithreaded code or shared libraries:

Multithreaded applications and shared library code should not use the GetCurrentDirectory function and should avoid using relative path names. The current directory state written by the SetCurrentDirectory function is stored as a global variable in each process, therefore multithreaded applications cannot reliably use this value without possible data corruption from other threads that may also be reading or setting this value. This limitation also applies to the SetCurrentDirectory and GetFullPathName functions. The exception being when the application is guaranteed to be running in a single thread, for example parsing file names from the command line argument string in the main thread prior to creating any additional threads. Using relative path names in multithreaded applications or shared library code can yield unpredictable results and is not supported.

Microsoft documentation: GetCurrentDirectory

So there is clearly widespread recognition that this is a problem. Whilst it is a very familiar API, it should not be a given that it deserves a place in swift-system, or that it deserves an easy and convenient name IMO. We should be taking the hint that every major system discourages using it.

If we really want to do this, might I suggest that it be named unsafeCurrentWorkingDirectory?

@lorentey
Copy link
Member

I agree with pretty much every word, but I don't think it would be practical for System not to expose getcwd -- the thing is, the concept of a current working directory -- flaws and all -- are inescapable on these platforms, and System already exposes it through the behavior of open when given a relative file path.

Providing a wrapper for getcwd is actively helping people who want to wean off the working directory -- e.g., I expect they might want to call it at least once at process startup to be able to translate relative names to absolute ones.

There are also a myriad valid reasons why people might want to change the current working directory. (I view System as mostly targeting use cases that are below Foundation -- I would go as far as to suggest it is usually a mistake to import System from a regular app.)

If we really want to do this, might I suggest that it be named unsafeCurrentWorkingDirectory?

We normally use the unsafe prefix to denote memory-unsafe functions. getcwd isn't like that.

Also, by this reasoning, pretty much every single API exposed by System ought to be marked unsafe -- these are extremely low-level APIs that should not be directly used by any regular program.

@lorentey
Copy link
Member

lorentey commented Nov 14, 2021

System already exposes it through the behavior of open when given a relative file path.

We definitely need a wrapper for openat, btw! It's sort of shameful we didn't have that in the initial release. 🙈

(That said, I don't think that openat eliminates the need to expose the current working directory, or that it is a full replacement for open -- after all, we need to get the fd to pass to openat somehow.)

Edit: As usual @milseman is way ahead of me 😅-- see #43.

@karwa
Copy link

karwa commented Nov 14, 2021

Sure - I just like to make sure that we're considering all the options, you know?

@milseman
Copy link
Contributor

We should lose the get prefix -- per the Swift API design guidelines, functions without side effects should read as noun phrases.

I (weakly-held) liked the get in the name because this isn't just exposing a stored property or trivially recomputing some information, it's issuing a syscall to fetch the info. It is having a "side effect" in a world where issuing a syscall is a side effect, especially one whose result may change in the presence of threading.

We could argue that making it a throws function already communicates this, and that's understandable to me, but I think there's some wiggle room WRT interpreting the API naming guidelines.

@dduan
Copy link
Contributor

dduan commented May 19, 2022

Because we don't have the luxury of using the spelling cwd, I dare suggest we look beyond prior art and either drop the word Working (Microsoft's Version) or Current (TaylorSwift's Version?)? Normally I don't go after "saving users from typing more". But (get)CurrentWorkingDirectory is quite unwieldy (imagine typing it without autocomplete) compared to (get)CurrentDirectory or (get)WorkingDirectory.

(making no judgement on the word get btw, I see merits both ways!)

@GeorgeLyon
Copy link
Contributor Author

Closing due to lack of activity... feel free to reopen and merge if you get to it

@GeorgeLyon GeorgeLyon closed this Mar 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants