
Commit

Minor cleanup of user guide text
stinodego committed Oct 6, 2023
1 parent 126db7d commit 1619420
Showing 3 changed files with 16 additions and 12 deletions.
docs/user-guide/io/cloud-storage.md (6 changes: 4 additions & 2 deletions)
@@ -3,6 +3,7 @@
Polars can read and write to AWS S3, Azure Blob Storage and Google Cloud Storage. The API is the same for all three storage providers.

To read from cloud storage, additional dependencies may be needed depending on the use case and cloud storage provider:

=== ":fontawesome-brands-python: Python"

```shell
@@ -31,7 +32,7 @@ Polars can scan a Parquet file in lazy mode from cloud storage. We may need to p

This query creates a `LazyFrame` without downloading the file. In the `LazyFrame` we have access to file metadata such as the schema. Polars uses the `object_store.rs` library internally to manage the interface with the cloud storage providers and so no extra dependencies are required in Python to scan a cloud Parquet file.
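
As a rough illustration, such a scan might look like the sketch below, assuming a hypothetical bucket `my-bucket` and credentials passed through `storage_options`:

```python
import polars as pl

# Hypothetical bucket and credentials, for illustration only.
storage_options = {
    "aws_access_key_id": "<key>",
    "aws_secret_access_key": "<secret>",
    "aws_region": "us-east-1",
}

lf = pl.scan_parquet("s3://my-bucket/data.parquet", storage_options=storage_options)
print(lf.schema)  # the schema is available without downloading the data
```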

If we create a lazy query with [predicate and projection pushdowns](../lazy/optimizations.md) the query optimiser will apply them before the file is downloaded. This can significantly reduce the amount of data that needs to be downloaded. The query evaluation is triggered by calling `collect`.
If we create a lazy query with [predicate and projection pushdowns](../lazy/optimizations.md), the query optimiser will apply them before the file is downloaded. This can significantly reduce the amount of data that needs to be downloaded. The query evaluation is triggered by calling `collect`.

{{code_block('user-guide/io/cloud-storage','scan_parquet_query',[])}}
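
A minimal sketch of such a query, assuming hypothetical `id` and `value` columns and credentials configured in the environment:

```python
import polars as pl

# Only the selected columns and matching rows need to be fetched;
# nothing is downloaded until collect() is called.
df = (
    pl.scan_parquet("s3://my-bucket/data.parquet")
    .filter(pl.col("id") < 100)
    .select(["id", "value"])
    .collect()
)
```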

@@ -40,10 +41,11 @@ If we create a lazy query with [predicate and projection pushdowns](../lazy/opti
We can also scan from cloud storage using PyArrow. This is particularly useful for partitioned datasets such as those using Hive partitioning.

We first create a PyArrow dataset and then create a `LazyFrame` from the dataset.

{{code_block('user-guide/io/cloud-storage','scan_pyarrow_dataset',[scan_pyarrow_dataset])}}
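
A sketch of that two-step flow, assuming a hypothetical Hive-partitioned Parquet dataset under `s3://my-bucket/hive-dataset/` and credentials available in the environment:

```python
import polars as pl
import pyarrow.dataset as ds

# Hypothetical Hive-partitioned Parquet dataset; PyArrow resolves the s3:// URI
# with its built-in S3 filesystem, so credentials come from the environment.
dset = ds.dataset(
    "s3://my-bucket/hive-dataset/",
    format="parquet",
    partitioning="hive",
)

lf = pl.scan_pyarrow_dataset(dset)
df = lf.filter(pl.col("year") == 2023).collect()
```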

## Writing to cloud storage

We can write a `DataFrame` to cloud storage in Python using s3fs for S3, adlfs for Azure Blob Storage and gcsfs for Google Cloud Storage. In this example we write a Parquet file to S3.
We can write a `DataFrame` to cloud storage in Python using s3fs for S3, adlfs for Azure Blob Storage and gcsfs for Google Cloud Storage. In this example, we write a Parquet file to S3.

{{code_block('user-guide/io/cloud-storage','write_parquet',[write_parquet])}}
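
A minimal sketch of the s3fs approach, assuming a hypothetical bucket `my-bucket` and credentials picked up from the environment:

```python
import polars as pl
import s3fs

df = pl.DataFrame({"foo": [1, 2, 3], "bar": ["a", "b", "c"]})

fs = s3fs.S3FileSystem()  # credentials are read from the environment

# Open a writable handle on S3 and let Polars write the Parquet file into it.
with fs.open("s3://my-bucket/data.parquet", mode="wb") as f:
    df.write_parquet(f)
```
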
docs/user-guide/io/database.md (18 changes: 10 additions & 8 deletions)
@@ -2,24 +2,25 @@

## Read from a database

Polars can read from a database using either the `pl.read_database_uri` and `pl.read_database` functions.
Polars can read from a database using the `pl.read_database_uri` and `pl.read_database` functions.

### Difference between the `read_database` functions
### Difference between `read_database_uri` and `read_database`

Use `pl.read_database_uri` if you want to specify the database connection with a connection string called a `uri`. For example, the following snippet shows a query to read all columns from the `foo` table in a Postgres database where we use the `uri` to connect:

{{code_block('user-guide/io/database','read_uri',['read_database_uri'])}}
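
A sketch of such a call, with a hypothetical Postgres connection string in place of real credentials:

```python
import polars as pl

# Hypothetical Postgres connection string; substitute real credentials.
uri = "postgresql://username:password@server:5432/database"

df = pl.read_database_uri(query="SELECT * FROM foo", uri=uri)
```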

On the other hand use `pl.read_database` if you want to connect via a connection engine created with a library like SQLAlchemy.
On the other hand, use `pl.read_database` if you want to connect via a connection engine created with a library like SQLAlchemy.

{{code_block('user-guide/io/database','read_cursor',['read_database'])}}
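
A sketch of the SQLAlchemy route, again with a hypothetical connection string; the matching database driver is assumed to be installed:

```python
import polars as pl
from sqlalchemy import create_engine

# Hypothetical connection string, for illustration only.
engine = create_engine("postgresql://username:password@server:5432/database")

df = pl.read_database(query="SELECT * FROM foo", connection=engine.connect())
```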

Note that `pl.read_database_uri` is likely to be faster than `pl.read_database` if you are using a SQLAlchemy or DBAPI2 connection, as these connections may load the data row-wise into Python before copying the data again to the column-wise Apache Arrow format.

### Engines

Polars doesn't manage connections and data transfer from databases by itself. Instead external libraries (known as _engines_) handle this.
Polars doesn't manage connections and data transfer from databases by itself. Instead, external libraries (known as _engines_) handle this.

If you use `pl.read_database` then you specify the engine when you make the connection object. If you use `pl.read_database_uri` then you can specify one of two engines to read from the database:
When using `pl.read_database`, you specify the engine when you create the connection object. When using `pl.read_database_uri`, you can specify one of two engines to read from the database:

- [ConnectorX](https://github.com/sfu-db/connector-x) and
- [ADBC](https://arrow.apache.org/docs/format/ADBC.html)
@@ -46,7 +47,7 @@ It is still early days for ADBC so support for different databases is still limi
$ pip install adbc-driver-sqlite
```

As ADBC is not the default engine you must specify the engine as an argument to `pl.read_database`
As ADBC is not the default engine, you must specify the engine as an argument to `pl.read_database_uri`

{{code_block('user-guide/io/database','adbc',['read_database_uri'])}}
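
A sketch of an ADBC read, assuming a hypothetical local SQLite file `my_database.db` and the `adbc-driver-sqlite` package installed:

```python
import polars as pl

# Hypothetical local SQLite file; requires the adbc-driver-sqlite package.
uri = "sqlite:///my_database.db"

df = pl.read_database_uri(query="SELECT * FROM foo", uri=uri, engine="adbc")
```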

@@ -73,9 +74,10 @@ In this example, we write the `DataFrame` to a table called `records` in the dat

{{code_block('user-guide/io/database','write',['write_database'])}}
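
A sketch of such a write, assuming a hypothetical SQLite database as the target:

```python
import polars as pl

df = pl.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Hypothetical SQLite target; the default engine writes via SQLAlchemy.
df.write_database(table_name="records", connection="sqlite:///my_database.db")
```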

In the SQLAlchemy approach Polars converts the `DataFrame` to a Pandas `DataFrame` backed by PyArrow and then uses SQLAlchemy methods on a Pandas `DataFrame` to write to the database.
In the SQLAlchemy approach, Polars converts the `DataFrame` to a Pandas `DataFrame` backed by PyArrow and then uses SQLAlchemy methods on a Pandas `DataFrame` to write to the database.

#### ADBC

As with reading from a database you can also use ADBC to write to a SQLite or Posgres database. As shown above you need to install the appropriate ADBC driver for your database.
As with reading from a database, you can also use ADBC to write to a SQLite or Postgres database. As shown above, you need to install the appropriate ADBC driver for your database.

{{code_block('user-guide/io/database','write_adbc',['write_database'])}}
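
A sketch of the ADBC variant, targeting the same hypothetical SQLite database:

```python
import polars as pl

df = pl.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

# Same hypothetical SQLite target, written through the ADBC engine instead.
df.write_database(
    table_name="records",
    connection="sqlite:///my_database.db",
    engine="adbc",
)
```
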
docs/user-guide/io/json_file.md (4 changes: 2 additions & 2 deletions)
@@ -14,15 +14,15 @@ Reading a JSON file should look familiar:

JSON objects that are delimited by newlines can be read into Polars in a much more performant way than standard JSON.

Polars can read an ND-JSON file into a `DataFrame` using the `read_ndjson` function:
Polars can read an NDJSON file into a `DataFrame` using the `read_ndjson` function:

{{code_block('user-guide/io/json-file','readnd',['read_ndjson'])}}
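
A minimal sketch, assuming a hypothetical file `data/example.ndjson` with one JSON object per line:

```python
import polars as pl

# Hypothetical file; each line holds one JSON object, e.g.
# {"id": 1, "name": "foo"}
# {"id": 2, "name": "bar"}
df = pl.read_ndjson("data/example.ndjson")
```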

## Write

{{code_block('user-guide/io/json-file','write',['write_json','write_ndjson'])}}

## Scan NDJSON
## Scan

`Polars` allows you to _scan_ a JSON input **only for newline-delimited JSON**. Scanning delays the actual parsing of the
file and instead returns a lazy computation holder called a `LazyFrame`.
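
A minimal sketch, assuming the same hypothetical `data/example.ndjson` file and an `id` column:

```python
import polars as pl

lf = pl.scan_ndjson("data/example.ndjson")  # no parsing happens yet

# The file is only read and parsed when the query is collected.
df = lf.filter(pl.col("id") > 1).collect()
```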
