From 5064a8e97b2cff2a28a5712517c5447aa5f4f4f5 Mon Sep 17 00:00:00 2001
From: Max Malekzadeh
Date: Thu, 12 Sep 2024 11:57:47 -0400
Subject: [PATCH 1/4] Author README.md

---
 README.md | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index df3e360..a529ccc 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,107 @@
 # SnowMint
-A blazingly fast unique and roughly sortable IDs generator, based on Twitter's Snowflake
+
+
+SnowMint is a blazingly fast unique and roughly sortable IDs generator based on Twitter's Snowflake.
+
+## Unique ID Generation
+### The Algorithm
+### Sorting
+
+## The Protocol
+
+## Clients
+### Go Client
+### Java Client
+
+## Install
+### Native Deployment
+### Docker Deployment
+
+## Benchmarks
+Latency:
+- In a native deployment, response time for each unique ID is roughly between 5 to 10 microseconds.
+- In a Docker container, response time for each unique ID is roughly between 10 to 15 microseconds.
+
+Here's a structured README for your SnowMint project:
+
+---------------------------------------------------------------------------------
+
+
+# SnowMint - A Blazingly Fast Unique ID Generator
+[![Go Workflow](https://github.com/mxmlkzdh/snowmint/actions/workflows/go.yml/badge.svg)](https://github.com/mxmlkzdh/snowmint/actions)
+
+**SnowMint** is a high-performance, distributed unique ID generator based on Twitter's Snowflake algorithm. It provides unique, sortable IDs that are generated using a client/server model with a custom protocol over raw TCP connections.
+
+## Unique ID Generation
+
+### Algorithm
+
+A SnowMint generates IDs using a 64-bit timestamp, a machine ID, a process ID, and a sequence number. This combination ensures uniqueness and allows for efficient sorting based on the timestamp.
+
+B SnowMint leverages Twitter’s Snowflake algorithm to generate 64-bit unique identifiers.
The format of the ID consists of:
+- **Timestamp**: 41 bits for time in milliseconds.
+- **Node ID**: 10 bits for machine or datacenter ID.
+- **Sequence**: 12 bits for a sequence number that resets every millisecond.
+
+Timestamp: A 41-bit field representing the current time in milliseconds since the epoch.
+Machine ID: A 10-bit field identifying the machine where the ID was generated.
+Process ID: A 5-bit field representing the process ID.
+Sequence Number: A 10-bit field for sequential IDs within a millisecond.
+
+This combination ensures that SnowMint can generate thousands of unique IDs per second, even in distributed environments.
+
+### Sorting IDs
+Since the first 41 bits represent the timestamp, SnowMint IDs are naturally sortable by creation time. IDs generated earlier will have a smaller numeric value than those generated later, allowing simple chronological ordering by comparing ID values directly.
+
+## The Protocol
+
+SnowMint uses a lightweight, highly optimized custom protocol over raw TCP connections. This design focuses on speed and simplicity, ensuring ultra-fast ID generation and retrieval.
+
+### How it Works:
+1. **Connection**: Clients open a TCP connection to the SnowMint server.
+2. **Command**: The client sends a single `GET` command to the server.
+3. **Response**: The server responds immediately with a 64-bit unique ID.
+
+This minimalist protocol reduces overhead, delivering unparalleled speed compared to traditional HTTP-based services.
+
+### Performance Benefits:
+- **Raw TCP**: Eliminates HTTP headers and other overhead, reducing the time between a request and response.
+- **Low-Latency**: Designed for microsecond-scale latencies, making it ideal for high-throughput systems.
+
+## Clients
+
+SnowMint provides easy-to-use SDKs for popular programming languages to integrate with the server and retrieve unique IDs.
+
+### Go SDK
+The Go client SDK allows seamless integration into Go applications.
A simple GET request over TCP fetches the unique ID.
+
+### Java SDK
+The Java SDK offers a similarly efficient way to connect to the SnowMint server, providing support for applications in JVM environments.
+
+## Install
+
+### Native Deployment
+1. Download the latest release from the [SnowMint releases page](#).
+2. Extract the archive and run the binary:
+   ```bash
+   ./snowmint-server --node-id
+   ```
+
+### Docker Deployment
+To run SnowMint in a Docker container, use the following:
+```bash
+docker pull snowmint/snowmint-server:latest
+docker run -d --name snowmint -p 8080:8080 snowmint/snowmint-server --node-id
+```
+
+## Benchmarks
+SnowMint has been benchmarked to handle thousands of requests per second, with latencies in the microsecond range. Thanks to the custom protocol and raw TCP connections, it outperforms traditional HTTP-based systems by a significant margin.
+
+- **ID Generation Rate**: Up to X,000 IDs per second per node.
+- **Latency**: Sub-millisecond, typically under X microseconds.
+
+## Sources
+The SnowMint project is open source and available on GitHub. Check out the [source code](#) to contribute or explore the internals of the system.
+
+## License
+The SnowMint project is licensed under the MIT License.
\ No newline at end of file
From a3cf251c1256220d7a2e55c17523f7dd2d3ed2ff Mon Sep 17 00:00:00 2001
From: Max Malekzadeh
Date: Thu, 12 Sep 2024 12:13:34 -0400
Subject: [PATCH 2/4] Add notes on Time Synchronization in Distributed Systems

---
 README.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/README.md b/README.md
index a529ccc..8b3b108 100644
--- a/README.md
+++ b/README.md
@@ -53,6 +53,17 @@ This combination ensures that SnowMint can generate thousands of unique IDs per
 ### Sorting IDs
 Since the first 41 bits represent the timestamp, SnowMint IDs are naturally sortable by creation time.
IDs generated earlier will have a smaller numeric value than those generated later, allowing simple chronological ordering by comparing ID values directly.

+### Time Synchronization in Distributed Systems
+
+In distributed systems, it is **crucial** that all nodes maintain synchronized clocks to ensure the uniqueness of the IDs. Since the Snowflake algorithm heavily relies on the system timestamp (41 bits of the ID represent the time), any drift in a node's clock can lead to the generation of duplicate IDs, which breaks the uniqueness guarantee.
+
+To avoid this issue:
+- **Synchronize Time Across Nodes**: Use tools like **NTP (Network Time Protocol)** or similar to keep system clocks in sync.
+- **Monitor Time Drift**: Ensure that the time drift between nodes is kept to a minimum (e.g., within a few milliseconds).
+- **Fallback Mechanism**: If a node detects that its clock is out of sync, it should halt ID generation until the clock is corrected to prevent collisions.
+
+Failing to synchronize time across all nodes may result in non-unique IDs being generated, which could lead to issues in systems where uniqueness is critical.
+
 ## The Protocol

 SnowMint uses a lightweight, highly optimized custom protocol over raw TCP connections. This design focuses on speed and simplicity, ensuring ultra-fast ID generation and retrieval.
From 2fc54bc374752b0c35ae49cc0e08b312113597ac Mon Sep 17 00:00:00 2001
From: Max Malekzadeh <11231195+mxmlkzdh@users.noreply.github.com>
Date: Thu, 12 Sep 2024 16:08:14 -0400
Subject: [PATCH 3/4] Update README.md

---
 README.md | 123 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 63 insertions(+), 60 deletions(-)

diff --git a/README.md b/README.md
index 8b3b108..5b8680a 100644
--- a/README.md
+++ b/README.md
@@ -1,61 +1,43 @@
-# SnowMint
-
-
-SnowMint is a blazingly fast unique and roughly sortable IDs generator based on Twitter's Snowflake.
-
-## Unique ID Generation
-### The Algorithm
-### Sorting
-
-## The Protocol
-
-## Clients
-### Go Client
-### Java Client
-
-## Install
-### Native Deployment
-### Docker Deployment
-
-## Benchmarks
-Latency:
-- In a native deployment, response time for each unique ID is roughly between 5 to 10 microseconds.
-- In a Docker container, response time for each unique ID is roughly between 10 to 15 microseconds.
-
-Here's a structured README for your SnowMint project:
-
----------------------------------------------------------------------------------
-
-
 # SnowMint - A Blazingly Fast Unique ID Generator
 [![Go Workflow](https://github.com/mxmlkzdh/snowmint/actions/workflows/go.yml/badge.svg)](https://github.com/mxmlkzdh/snowmint/actions)

-**SnowMint** is a high-performance, distributed unique ID generator based on Twitter's Snowflake algorithm. It provides unique, sortable IDs that are generated using a client/server model with a custom protocol over raw TCP connections.
+**SnowMint** is a high-performance, distributed unique ID generator based on X's Snowflake algorithm. It provides unique, roughly sortable IDs that are generated using a client/server model with a custom protocol.

-## Unique ID Generation
+## SnowMint IDs

-### Algorithm
+### ID Generation Algorithm
+SnowMint leverages X’s Snowflake algorithm to generate **signed, non-negative 64-bit unique identifiers called SnowMint IDs**. The format of the ID consists of:

-A SnowMint generates IDs using a 64-bit timestamp, a machine ID, a process ID, and a sequence number. This combination ensures uniqueness and allows for efficient sorting based on the timestamp.
+- **Sign**: A 1-bit field. It will always be 0.
+- **Timestamp**: A 43-bit field representing the current time in milliseconds since your organization's epoch. This allows for generating 278 years 11 months 2.049 days worth of unique IDs from the epoch.
+- **DataCenter ID**: A 5-bit field identifying the data center where the ID was generated.
+- **Node ID**: A 5-bit field identifying the machine where the ID was generated.
+- **Sequence**: A 10-bit field for a sequence number that resets every millisecond.

-B SnowMint leverages Twitter’s Snowflake algorithm to generate 64-bit unique identifiers. The format of the ID consists of:
-- **Timestamp**: 41 bits for time in milliseconds.
-- **Node ID**: 10 bits for machine or datacenter ID.
-- **Sequence**: 12 bits for a sequence number that resets every millisecond.
+This combination ensures that SnowMint can (theoretically) generate more than 1,000,000 unique IDs per second, even in distributed environments.

-Timestamp: A 41-bit field representing the current time in milliseconds since the epoch.
-Machine ID: A 10-bit field identifying the machine where the ID was generated.
-Process ID: A 5-bit field representing the process ID.
-Sequence Number: A 10-bit field for sequential IDs within a millisecond.
+### An Example
+The SnowMint ID `817347092935625747` is generated by a SnowMint server with the following (configurable) parameters:
+- Epoch: 946684800000 (Saturday, January 01 2000 00:00:00.00 GMT+0000)
+- DataCenter ID: 0
+- Node ID: 19

-This combination ensures that SnowMint can generate thousands of unique IDs per second, even in distributed environments.
+The binary representation of this ID is presented below. Note that for this particular ID, the timestamp is `1726167730122 (Thursday, September 12, 2024 19:02:10.122 GMT+0000)` and the sequence number is equal to `19`.
+
+```
+0 0001011010101111100110011011001101111001010 00000 10011 0000010011
+  62                                          19    14    9        0
+```

 ### Sorting IDs
-Since the first 41 bits represent the timestamp, SnowMint IDs are naturally sortable by creation time. IDs generated earlier will have a smaller numeric value than those generated later, allowing simple chronological ordering by comparing ID values directly.
+Since the first significant 43 bits represent the timestamp, SnowMint IDs are naturally sortable by creation time.
IDs generated earlier will have a smaller numeric value than those generated later, allowing simple chronological ordering by comparing ID values directly. You can easily retrieve this timestamp with the following formula:
+```
+(SnowMintID >> 20) + EPOCH
+```

 ### Time Synchronization in Distributed Systems

-In distributed systems, it is **crucial** that all nodes maintain synchronized clocks to ensure the uniqueness of the IDs. Since the Snowflake algorithm heavily relies on the system timestamp (41 bits of the ID represent the time), any drift in a node's clock can lead to the generation of duplicate IDs, which breaks the uniqueness guarantee.
+In distributed systems, it is **crucial** that all nodes maintain synchronized clocks to ensure the uniqueness of the IDs. Since the SnowMint algorithm heavily relies on the system timestamp (43 bits of the ID represent the time), any drift in a node's clock can lead to the generation of duplicate IDs, which breaks the uniqueness guarantee.

 To avoid this issue:
 - **Synchronize Time Across Nodes**: Use tools like **NTP (Network Time Protocol)** or similar to keep system clocks in sync.
@@ -79,40 +61,61 @@ This minimalist protocol reduces overhead, delivering unparalleled speed compare
 - **Raw TCP**: Eliminates HTTP headers and other overhead, reducing the time between a request and response.
 - **Low-Latency**: Designed for microsecond-scale latencies, making it ideal for high-throughput systems.

-## Clients
+## Install

-SnowMint provides easy-to-use SDKs for popular programming languages to integrate with the server and retrieve unique IDs.
+The SnowMint server accepts the following optional command-line flags:

-### Go SDK
-The Go client SDK allows seamless integration into Go applications. A simple GET request over TCP fetches the unique ID.
+`--address` The address for the server to bind to (default: localhost)

-### Java SDK
-The Java SDK offers a similarly efficient way to connect to the SnowMint server, providing support for applications in JVM environments.
+`--port` The port for the server to bind to (default: 8080)

-## Install
+`--datacenter` The server's data center ID between 0 and 31 (default: 0)
+
+`--node` The server's node ID between 0 and 31 (default: 0)
+
+`--epoch` Your organization's epoch in milliseconds (default: 0 _Thursday, January 1, 1970 00:00:00 UTC_)
+
+Note that in distributed systems, it is **crucial** that all instances of SnowMint servers are started with the same `epoch`.

 ### Native Deployment
-1. Download the latest release from the [SnowMint releases page](#).
-2. Extract the archive and run the binary:
+1. Download the latest release from the [SnowMint releases page](#) (or alternatively, you can clone this repository and build the binary yourself with `go build`).
+2. Extract the archive and run the binary with your desired flags; e.g.:
    ```bash
-   ./snowmint-server --node-id
+   ./snowmint --node=
    ```

 ### Docker Deployment
-To run SnowMint in a Docker container, use the following:
+To run SnowMint in a Docker container, simply use the following with your desired flags; e.g.:
+```bash
+docker pull mxmlkzdh/snowmint:latest
+docker run -d --name snowmint -p 8080:8080 mxmlkzdh/snowmint --node=
+```
+
+### Generate Your First SnowMint ID
+Once an instance of the SnowMint server is up and running, execute the following command in your favorite terminal emulator and you'll receive a newly minted SnowMint ID!
 ```bash
-docker pull snowmint/snowmint-server:latest
-docker run -d --name snowmint -p 8080:8080 snowmint/snowmint-server --node-id
+echo GET | nc localhost 8080
 ```

+## Clients
+
+SnowMint provides easy-to-use SDKs for popular programming languages to integrate with your system and retrieve unique IDs.
+
+| Language | Description |
+| ------------- | ------------- |
+| [Go SDK](https://github.com/mxmlkzdh/snowmint-go) | The Go client SDK allows seamless integration into Go applications. A simple GET request over TCP fetches the unique ID. |
+| [Java SDK](https://github.com/mxmlkzdh/snowmint-java) | The Java SDK offers a similarly efficient way to connect to the SnowMint server, providing support for applications in JVM environments. |
+
 ## Benchmarks
 SnowMint has been benchmarked to handle thousands of requests per second, with latencies in the microsecond range. Thanks to the custom protocol and raw TCP connections, it outperforms traditional HTTP-based systems by a significant margin.

-- **ID Generation Rate**: Up to X,000 IDs per second per node.
+- **ID Generation Rate**: More than 100,000 IDs per second per node.
 - **Latency**: Sub-millisecond, typically under X microseconds.
+  - In a native deployment, response time for each unique ID is roughly between 5 and 10 microseconds.
+  - In a Docker container, response time for each unique ID is roughly between 10 and 15 microseconds.

 ## Sources
-The SnowMint project is open source and available on GitHub. Check out the [source code](#) to contribute or explore the internals of the system.
+- [Snowflake ID](https://en.wikipedia.org/wiki/Snowflake_ID)

 ## License
-The SnowMint project is licensed under the MIT License.
\ No newline at end of file
+The SnowMint project is licensed under the [MIT License](LICENSE).
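
Reviewer note on PATCH 3/4: the bit layout and the `(SnowMintID >> 20) + EPOCH` formula introduced in this README revision can be checked with a small Go sketch. It assumes the layout exactly as the README states it (1 sign bit, 43 timestamp bits, 5 data center bits, 5 node bits, 10 sequence bits). The names `Fields`, `Decode`, and `Encode` are hypothetical and are not taken from the SnowMint codebase.

```go
package main

import "fmt"

// Field widths as described in the README; the shifts follow from them.
const (
	sequenceBits   = 10
	nodeBits       = 5
	dataCenterBits = 5

	nodeShift       = sequenceBits
	dataCenterShift = sequenceBits + nodeBits
	timestampShift  = sequenceBits + nodeBits + dataCenterBits // 20
)

// Fields is a hypothetical container for the decoded parts of an ID.
type Fields struct {
	TimestampMillis int64 // absolute time in milliseconds (epoch already added)
	DataCenterID    int64
	NodeID          int64
	Sequence        int64
}

// Decode splits an ID into its fields, given the epoch the server was started with.
func Decode(id, epoch int64) Fields {
	return Fields{
		TimestampMillis: (id >> timestampShift) + epoch,
		DataCenterID:    (id >> dataCenterShift) & (1<<dataCenterBits - 1),
		NodeID:          (id >> nodeShift) & (1<<nodeBits - 1),
		Sequence:        id & (1<<sequenceBits - 1),
	}
}

// Encode is the inverse of Decode; it assumes every field is already in range.
func Encode(f Fields, epoch int64) int64 {
	return (f.TimestampMillis-epoch)<<timestampShift |
		f.DataCenterID<<dataCenterShift |
		f.NodeID<<nodeShift |
		f.Sequence
}

func main() {
	// Example ID and epoch taken from the README's "An Example" section.
	f := Decode(817347092935625747, 946684800000)
	fmt.Printf("%+v\n", f)
}
```

Run against the README's example ID and epoch, this yields the timestamp, node ID, and sequence number quoted there, which is a quick way to confirm that the documented shift of 20 matches the documented field widths.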
From 738a5b8b2a3ac3fe2fa6d6c7946e470c265b9b12 Mon Sep 17 00:00:00 2001
From: Max Malekzadeh <11231195+mxmlkzdh@users.noreply.github.com>
Date: Thu, 12 Sep 2024 16:09:25 -0400
Subject: [PATCH 4/4] Update command line flags

---
 config/config.go | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/config/config.go b/config/config.go
index 790e125..6ef3b3a 100644
--- a/config/config.go
+++ b/config/config.go
@@ -15,8 +15,8 @@ type Config struct {
 func LoadConfig() *Config {
 	address := flag.String("address", "localhost", "The address to bind to.")
 	port := flag.Int("port", 8080, "The port to bind to.")
-	dataCenterID := flag.Int("dataCenterID", 0, "The data center ID.")
-	nodeID := flag.Int("nodeID", 0, "The node ID.")
+	dataCenterID := flag.Int("datacenter", 0, "The data center ID.")
+	nodeID := flag.Int("node", 0, "The node ID.")
 	epoch := flag.Int("epoch", 0, "The epoch in milliseconds.")
 	flag.Parse()
 	return &Config{
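
Reviewer note on PATCH 4/4: the README documents `--datacenter` and `--node` as IDs between 0 and 31, and the hunk above does not show whether `LoadConfig` enforces that range. The following is a minimal sketch of one way such a check could look, assuming the two flag names from this patch. `parseIDFlags`, `idFlags`, and `maxFieldID` are hypothetical names, and a separate `flag.FlagSet` is used only so the sketch can be exercised without touching the process-wide flags.

```go
package main

import (
	"flag"
	"fmt"
)

// maxFieldID is the largest value a 5-bit field can hold (2^5 - 1 = 31).
const maxFieldID = 1<<5 - 1

// idFlags is a hypothetical holder for the two ID flags.
type idFlags struct {
	DataCenterID int
	NodeID       int
}

// parseIDFlags parses --datacenter and --node and rejects out-of-range values.
func parseIDFlags(args []string) (idFlags, error) {
	fs := flag.NewFlagSet("snowmint", flag.ContinueOnError)
	dataCenterID := fs.Int("datacenter", 0, "The data center ID.")
	nodeID := fs.Int("node", 0, "The node ID.")
	if err := fs.Parse(args); err != nil {
		return idFlags{}, err
	}
	if *dataCenterID < 0 || *dataCenterID > maxFieldID {
		return idFlags{}, fmt.Errorf("datacenter must be between 0 and %d, got %d", maxFieldID, *dataCenterID)
	}
	if *nodeID < 0 || *nodeID > maxFieldID {
		return idFlags{}, fmt.Errorf("node must be between 0 and %d, got %d", maxFieldID, *nodeID)
	}
	return idFlags{DataCenterID: *dataCenterID, NodeID: *nodeID}, nil
}

func main() {
	v, err := parseIDFlags([]string{"--datacenter=3", "--node=19"})
	fmt.Println(v, err)
}
```

A value outside the 5-bit range would otherwise spill into the neighbouring field when the ID is assembled, so rejecting it at startup is one way to keep the documented layout intact.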