Update script to do more of a full audit
Tim Stallmann committed Aug 26, 2022
1 parent 11d9b45 commit 94c890f
Showing 3 changed files with 47 additions and 9 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -3,3 +3,6 @@ client_secret.json
 client_id.json
 storage.json
 env/
+*.csv
+venv/
+.idea/
10 changes: 9 additions & 1 deletion README.md
@@ -1,4 +1,12 @@
 google-drive-list-shared
 ========================

-Supporting files for blog post: [Find All Shared Files in Google Drive with a Python Script](https://thornelabs.net/posts/find-all-shared-files-in-google-drive-with-a-python-script/).
+This tool crawls through all the Drive files you have access to and writes a CSV auditing the permissions of each one.
+
+Based on https://github.com/jameswthorne/google-drive-list-shared
+
+To use:
+1. Read through the blog post at [Find All Shared Files in Google Drive with a Python Script](https://thornelabs.net/posts/find-all-shared-files-in-google-drive-with-a-python-script/)
+2. Set up a Google API project and credentials as described in the blog post, then download the credentials to `client_id.json` in this directory.
+3. If desired, set `email_to_audit` in the Python file to an email address instead of `False`. This is mainly a convenience: it adds a column to the CSV recording whether that address has access to each file.
+4. Run the script. A new CSV file with the results will appear in this directory.
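
Once the script has run, the resulting CSV can be filtered with the standard library. The sketch below is only an illustration: it assumes the column names written by the script in this commit (`name`, `link`, `shared_publicly`) and that `csv.DictWriter` serialized the booleans as the strings `True` and `False`. The helper name `publicly_shared` and the sample rows are hypothetical.

```python
import csv
import io


def publicly_shared(csv_file):
    """Return (name, link) pairs for rows whose shared_publicly column is 'True'."""
    reader = csv.DictReader(csv_file)
    return [
        (row["name"], row["link"])
        for row in reader
        if row.get("shared_publicly") == "True"
    ]


# Inline sample standing in for a real drive_audit_results-<date>.csv file.
sample = io.StringIO(
    "name,link,shared,shared_publicly,shared_with\r\n"
    "Budget,https://example.com/1,True,False,alice@example.com\r\n"
    'Flyer,https://example.com/2,True,True,"alice@example.com,anyone"\r\n'
)
print(publicly_shared(sample))  # [('Flyer', 'https://example.com/2')]
```

For a real run, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, newline='')`.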
43 changes: 35 additions & 8 deletions google-drive-list-shared.py
@@ -8,7 +8,13 @@
 from httplib2 import Http
 from oauth2client import file, client, tools

+from datetime import date
+from csv import DictWriter
+
+email_to_audit = False
 SCOPES = 'https://www.googleapis.com/auth/drive.readonly.metadata'
+output_filename = f'drive_audit_results-{date.today().isoformat()}.csv'
+
 store = file.Storage('storage.json')
 creds = store.get()
 if not creds or creds.invalid:
@@ -18,19 +24,21 @@
 service = discovery.build('drive', 'v3', http=creds.authorize(Http()))
 results = service.files().list(
     pageSize=1000,
-    fields="nextPageToken, files(name, shared)").execute()
+    fields="nextPageToken, files(name, shared, permissions, webViewLink)").execute()
 token = results.get('nextPageToken', None)
 items = results.get('files', [])

 while token is not None:
     results = service.files().list(
         pageSize=1000,
         pageToken=token,
-        fields="nextPageToken, files(name, shared)").execute()
+        fields="nextPageToken, files(name, shared, permissions, webViewLink)").execute()

     # Store the new nextPageToken on each loop iteration
     token = results.get('nextPageToken', None)
     # Append the next set of results to the items variable
     items.extend(results.get('files', []))
+    print(f'Loaded {len(items)} files so far')

 # The Google Drive does not return valid JSON because the property
 # names are not enclosed in double quotes, they are enclosed in
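
The `nextPageToken` loop in this hunk can also be written as a generator, which avoids repeating the `files().list(...)` call for the first page. This is a minimal sketch, not the commit's code: `iter_files`, `list_page`, and the fake two-page backend are hypothetical names used so the example runs without Google credentials.

```python
def iter_files(list_page, page_size=1000):
    """Yield every file by following nextPageToken until it is absent.

    list_page is a hypothetical callable that accepts pageSize/pageToken
    keyword arguments and returns a dict shaped like a files.list response.
    """
    token = None
    while True:
        kwargs = {"pageSize": page_size}
        if token is not None:
            kwargs["pageToken"] = token
        response = list_page(**kwargs)
        yield from response.get("files", [])
        # A missing nextPageToken means this was the last page.
        token = response.get("nextPageToken")
        if token is None:
            break


# Fake two-page backend used only to exercise the loop.
_PAGES = {
    None: {"files": [{"name": "a"}, {"name": "b"}], "nextPageToken": "t1"},
    "t1": {"files": [{"name": "c"}]},
}


def fake_list_page(pageSize, pageToken=None):
    return _PAGES[pageToken]


print([f["name"] for f in iter_files(fake_list_page)])  # ['a', 'b', 'c']
```

With the real client, `list_page` could plausibly be a small wrapper such as `lambda **kw: service.files().list(fields=..., **kw).execute()`, reusing the `fields` string from the diff.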
@@ -39,10 +47,29 @@
 items_dict = ast.literal_eval(str(items))

 print("You have", len(items_dict), "files in Google Drive\n")
-print("The following files are shared:\n")

-# Iterate through the items list and only show files that have
-# shared set to True.
-for i in range(len(items_dict)):
-    if items_dict[i]['shared']:
-        print(items_dict[i]['name'])
+with open(output_filename, 'w') as output_file:
+    # Write one CSV row per file, recording who each file is
+    # shared with.
+    fieldnames = ['name', 'link', 'shared', 'shared_publicly', 'shared_with']
+    if email_to_audit:
+        fieldnames.append('shared_with_audited_email')
+    writer = DictWriter(output_file, fieldnames=fieldnames)
+    writer.writeheader()
+    for i in range(len(items_dict)):
+        item = items_dict[i]
+        output_row = {
+            'name': item.get('name', ''),
+            'link': item.get('webViewLink', ''),
+            'shared': item.get('shared', ''),
+            'shared_with': ','.join([
+                p.get('emailAddress', p.get('displayName', p.get('type', '')))
+                for p in item.get('permissions', [])
+            ])
+        }
+
+        output_row['shared_publicly'] = 'anyone' in output_row['shared_with']
+
+        if email_to_audit:
+            output_row['shared_with_audited_email'] = email_to_audit in output_row['shared_with']
+        writer.writerow(output_row)
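
One caveat about the row-building code in this hunk: both membership tests run against the comma-joined `shared_with` string, so they are substring matches. An address such as `anyone@example.com` would set `shared_publicly`, and auditing `bob@example.com` would also match `jbob@example.com`. The sketch below shows one possible alternative that inspects the permission entries directly. It assumes each permission dict carries `type` and `emailAddress` keys, as Drive v3 permission resources do; `build_row` and the sample item are hypothetical.

```python
def build_row(item, email_to_audit=None):
    """Build one CSV row, using exact matches on permission entries."""
    permissions = item.get("permissions", [])
    row = {
        "name": item.get("name", ""),
        "link": item.get("webViewLink", ""),
        "shared": item.get("shared", ""),
        # Public sharing is a permission whose type is exactly "anyone".
        "shared_publicly": any(p.get("type") == "anyone" for p in permissions),
        "shared_with": ",".join(
            p.get("emailAddress", p.get("displayName", p.get("type", "")))
            for p in permissions
        ),
    }
    if email_to_audit:
        wanted = email_to_audit.lower()
        # Compare whole addresses, case-insensitively, instead of substrings.
        row["shared_with_audited_email"] = any(
            (p.get("emailAddress") or "").lower() == wanted for p in permissions
        )
    return row


# Hypothetical item shaped like an entry in the files.list response.
item = {
    "name": "Flyer",
    "webViewLink": "https://example.com/2",
    "shared": True,
    "permissions": [
        {"type": "user", "emailAddress": "anyone@example.com"},
        {"type": "user", "emailAddress": "jbob@example.com"},
    ],
}
row = build_row(item, email_to_audit="bob@example.com")
print(row["shared_publicly"], row["shared_with_audited_email"])  # False False
```

The returned dict has the same keys as `output_row` in the diff, so it could be passed to `writer.writerow` unchanged.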
