-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add --pid-file
option to write PID files
#25474
base: main-2.x
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,6 +9,7 @@ import ( | |
nethttp "net/http" | ||
"os" | ||
"path/filepath" | ||
"strconv" | ||
"strings" | ||
"sync" | ||
"time" | ||
|
@@ -248,6 +249,10 @@ func (m *Launcher) run(ctx context.Context, opts *InfluxdOpts) (err error) { | |
} | ||
} | ||
|
||
if err := m.writePIDFile(opts.PIDFile); err != nil { | ||
return fmt.Errorf("error writing PIDFile %q: %w", opts.PIDFile, err) | ||
} | ||
|
||
m.reg = prom.NewRegistry(m.log.With(zap.String("service", "prom_registry"))) | ||
m.reg.MustRegister(collectors.NewGoCollector()) | ||
|
||
|
@@ -970,6 +975,40 @@ func (m *Launcher) initTracing(opts *InfluxdOpts) { | |
} | ||
} | ||
|
||
// writePIDFile will write the process ID to pidFile and register a cleanup function to delete it during | ||
// shutdown. If pidFile is empty, then no PID file is written and no cleanup function is registered. | ||
func (m *Launcher) writePIDFile(pidFile string) error { | ||
if pidFile == "" { | ||
return nil | ||
} | ||
|
||
// Create directory to PIDfile if needed. | ||
if err := os.MkdirAll(filepath.Dir(pidFile), 0777); err != nil { | ||
return fmt.Errorf("mkdir: %w", err) | ||
} | ||
|
||
// Write PID to file | ||
pidStr := strconv.Itoa(os.Getpid()) | ||
if writeErr := os.WriteFile(pidFile, []byte(pidStr), 0666); writeErr != nil { | ||
// Let's make sure we don't leave a PID file behind on error. | ||
removeErr := os.Remove(pidFile) | ||
return fmt.Errorf("write file: %w; remove file: %w", writeErr, removeErr) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps add the path to the error message There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm wondering if we want to use something lower level than There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tend to find using the PID file as a lock file more annoying than useful (looking at you PostgreSQL!). It also generates tons of support questions. The less annoying way of preventing multiple instances is to check for for already listening sockets. We kind of do that today by aborting if we can't bind to the bind addresses. We do the abort after we've opened the PID file, so we would blow away the PID file on shutdown in those situations even though the other (presumed) influxd instance is running. If we want more robust handling of the PID file if you accidentally launch multiple influxd instances with the same configuration, my suggestion would be a feature request to try connecting to the bind addresses very early on to see if anything is listening on them, or use a Unix domain socket / named pipe for the check. This way when the application dies, there is nothing listening or responding on the socket. With a PID or lock file, the file can be stay around if the application is not shut down cleanly. Or we could simply write the PID after we bind the sockets. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On second thought, I don't like the idea of dropping the PID file after we're listening for connections, because that could put a long delay between There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My suggestion then, to avoid support questions, would be to check for existence, issue a warning if present, and then continue, rather than a fatal error. That will let the customer know that the WAL flush may not have happened on shutdown, for instance. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we can do a compromise between the PID approaches. If you specify Adding an extra parameter does add complexity, but the impact of Thoughts? |
||
} | ||
|
||
// Add a cleanup function. | ||
m.closers = append(m.closers, labeledCloser{ | ||
label: "pidfile", | ||
closer: func(context.Context) error { | ||
if err := os.Remove(pidFile); err != nil { | ||
return fmt.Errorf("removing PID file %q: %w", pidFile, err) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the way! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In this case the caller to the cleanup function has no context, so the error must contain the filename. |
||
} | ||
return nil | ||
}, | ||
}) | ||
|
||
return nil | ||
} | ||
|
||
// openMetaStores opens the embedded DBs used to store metadata about influxd resources, migrating them to | ||
// the latest schema expected by the server. | ||
// On success, a unique ID is returned to be used as an identifier for the influxd instance in telemetry. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,9 @@ import ( | |
"encoding/json" | ||
"io" | ||
nethttp "net/http" | ||
"os" | ||
"path/filepath" | ||
"strconv" | ||
"testing" | ||
"time" | ||
|
||
|
@@ -14,6 +17,7 @@ import ( | |
"github.com/influxdata/influxdb/v2/http" | ||
"github.com/influxdata/influxdb/v2/tenant" | ||
"github.com/stretchr/testify/assert" | ||
"github.com/stretchr/testify/require" | ||
) | ||
|
||
// Default context. | ||
|
@@ -164,3 +168,21 @@ func TestLauncher_PingHeaders(t *testing.T) { | |
assert.Equal(t, []string{"OSS"}, resp.Header.Values("X-Influxdb-Build")) | ||
assert.Equal(t, []string{"dev"}, resp.Header.Values("X-Influxdb-Version")) | ||
} | ||
|
||
func TestLauncher_PIDFile(t *testing.T) { | ||
pidDir := t.TempDir() | ||
pidFilename := filepath.Join(pidDir, "influxd.pid") | ||
|
||
l := launcher.RunAndSetupNewLauncherOrFail(ctx, t, func(o *launcher.InfluxdOpts) { | ||
o.PIDFile = pidFilename | ||
}) | ||
defer func() { | ||
l.ShutdownOrFail(t, ctx) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I ran in to a similar thing as this recently: #25398 (comment) I think that this method call There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's inside a lambda, so it will not be evaluated until the defer calls the lambda. Also note that if it got called immediately the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @devanbenz - the arguments to a deferred function are evaluated on the execution of the |
||
require.NoFileExists(t, pidFilename) | ||
}() | ||
|
||
require.FileExists(t, pidFilename) | ||
pidBytes, err := os.ReadFile(pidFilename) | ||
require.NoError(t, err) | ||
require.Equal(t, strconv.Itoa(os.Getpid()), string(pidBytes)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add the path in the
fmt.Errorf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context of the PID filename is added by the caller:
influxdb/cmd/influxd/launcher/launcher.go
Lines 252 to 254 in a91ba1c
I figured this was acceptable elimination of code inside
writePIDFile
since it is a private, single-use method. For a public method I definitely would have added the filename into each message.