Sourcegraph Data Center is configured by applying Kubernetes YAML files and simple `kubectl` commands.
Since everything is vanilla Kubernetes, you can configure Sourcegraph as flexibly as you need to meet the requirements of your deployment environment. Below, we provide instructions for common customizations such as setting up TLS, enabling code intelligence, and exposing Sourcegraph to external traffic.
## Fork this repository

We recommend you fork this repository to track your configuration changes in Git. This will make upgrades far easier and is a good practice not just for Sourcegraph, but for any Kubernetes application.
- Create a fork of this repository.

  - The fork can be public **unless** you plan to store secrets in the repository itself. (We recommend not storing secrets in the repository, and these instructions document how to avoid doing so.)

- Create a release branch to track all of your customizations to Sourcegraph. When you upgrade Sourcegraph Data Center, you will merge upstream into this branch.

  ```bash
  git checkout HEAD -b release
  ```

  If you followed the installation instructions, `HEAD` should point at the Git tag you've deployed to your running Kubernetes cluster.

- Commit customizations to your release branch:

  - Commit manual modifications to Kubernetes YAML files.
  - Commit commands that should be run on every update (e.g. `kubectl apply`) to ./kubectl-apply-all.sh.
  - Commit commands that generally only need to be run once per cluster (e.g. `kubectl create secret`, `kubectl expose`) to ./create-new-cluster.sh.
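For reference, a minimal kubectl-apply-all.sh might look like the following (a sketch; the exact flags are assumptions and should match how you deploy):

```bash
#!/bin/bash
# kubectl-apply-all.sh — run on every update.
# A sketch: applies every manifest under base/, pruning resources removed from
# the fork (assumes all resources carry the deploy=sourcegraph label).
kubectl apply --prune -l deploy=sourcegraph -f base --recursive
```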
## Dependencies

Configuration steps in this file depend on [jq](https://stedolan.github.io/jq/), [yj](https://github.com/sourcegraph/yj), and [jy](https://github.com/sourcegraph/jy).
- Configure a storage class
- Configure network access
- Update site configuration
- Configure TLS/SSL
- Configure repository cloning via SSH
- Configure language servers
- Configure SSDs to boost performance
- Increase memory or CPU limits
- Configure gitserver replica count
- Assign resource-hungry pods to larger nodes
- Configure Prometheus
- Configure Jaeger tracing
- Configure Lightstep tracing
- Configure custom Redis
- Configure custom PostgreSQL
- Install without RBAC
## Configure network access

You need to make the main web server accessible over the network to external users.

There are a few approaches, but using an ingress controller is recommended.

### Ingress controller (recommended)

For production environments, we recommend using the [ingress-nginx](https://kubernetes.github.io/ingress-nginx/) ingress controller.
As part of our base configuration, we install an ingress for `sourcegraph-frontend`. It installs rules for the default ingress; see the comments in the file to restrict it to a specific host.

In addition to the `sourcegraph-frontend` ingress, you'll need to install the NGINX ingress controller (ingress-nginx). Follow the instructions at https://kubernetes.github.io/ingress-nginx/deploy/ to create the ingress controller. Add the files to configure/ingress-nginx, including an install.sh file which applies the relevant manifests. We include sample generic-cloud manifests as part of this repository, but please follow the official instructions for your cloud provider.
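For example, configure/ingress-nginx/install.sh might look like the following (a sketch; the manifest filenames are assumptions and should match the files you added for your provider):

```bash
#!/bin/bash
# configure/ingress-nginx/install.sh
# A sketch: applies the ingress-nginx manifests checked into this directory.
# mandatory.yaml and generic-cloud.yaml are assumed filenames; use the ones
# you downloaded from the official ingress-nginx deploy instructions.
set -e
kubectl apply -f configure/ingress-nginx/mandatory.yaml
kubectl apply -f configure/ingress-nginx/generic-cloud.yaml
```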
Add the configure/ingress-nginx/install.sh command to create-new-cluster.sh and commit the change:

```bash
echo ./configure/ingress-nginx/install.sh >> create-new-cluster.sh
```
Once the ingress has acquired an external address, you should be able to access Sourcegraph using that. You can check the external address by running the following command and looking for the `LoadBalancer` entry:

```bash
kubectl -n ingress-nginx get svc
```
If you are having trouble accessing Sourcegraph, first confirm that the ingress-nginx IP reported above is reachable; otherwise, see Troubleshooting ingress-nginx. Note that the ingress controller runs in the `ingress-nginx` namespace.

`ingress-nginx` has extensive configuration documented at NGINX Configuration. We expect most administrators to modify ingress-nginx annotations in sourcegraph-frontend.Ingress.yaml. Some settings are modified globally (such as HSTS); in that case, we expect administrators to modify the ingress-nginx ConfigMap in configure/ingress-nginx/mandatory.yaml.
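For example, to raise the maximum allowed request body size for this ingress, you could add an ingress-nginx annotation (a sketch; the value shown is illustrative, not a recommendation):

```yaml
# base/frontend/sourcegraph-frontend.Ingress.yaml (excerpt)
metadata:
  annotations:
    # "0" disables the body size limit; pick a value appropriate for your deployment.
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
```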
### NGINX service

In cases where ingress controllers cannot be created, creating an explicit NGINX service is a viable alternative. See the files in the configure/nginx-svc folder for an example of how to do this via a NodePort service (any other type of Kubernetes service will also work):

- Modify configure/nginx-svc/nginx.ConfigMap.yaml to contain the TLS certificate and key for your domain.

- Run `kubectl apply -f configure/nginx-svc` to create the NGINX service.

- Update create-new-cluster.sh with the previous command:

  ```bash
  echo kubectl apply -f configure/nginx-svc >> create-new-cluster.sh
  ```
### NodePort service

Note: this setup path does not support TLS.

- Add a network rule that allows ingress traffic to port 30080 (HTTP) on at least one node.
  - Google Cloud Platform firewall rules:

    - Expose the necessary ports:

      ```bash
      gcloud compute --project=$PROJECT firewall-rules create sourcegraph-frontend-http --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:30080
      ```
- Change the type of the `sourcegraph-frontend` service in base/frontend/sourcegraph-frontend.Service.yaml from `ClusterIP` to `NodePort`:

  ```diff
  spec:
    ports:
    - name: http
      port: 30080
  +   nodePort: 30080
  - type: ClusterIP
  + type: NodePort
  ```
- Directly applying this change to the service will fail. Instead, you must delete the old service and then create the new one (this will result in a few seconds of downtime):

  ```bash
  kubectl delete svc sourcegraph-frontend
  kubectl apply -f base/frontend/sourcegraph-frontend.Service.yaml
  ```
- Find a node name:

  ```bash
  kubectl get pods -l app=sourcegraph-frontend -o=custom-columns=NODE:.spec.nodeName
  ```
- Get the EXTERNAL-IP address (will be ephemeral unless you make it static):

  ```bash
  kubectl get node $NODE -o wide
  ```
- Sourcegraph should now be accessible at `$EXTERNAL_ADDR:30080`, where `$EXTERNAL_ADDR` is the address of any node in the cluster.
## Update site configuration

The site configuration is stored inside a ConfigMap, which is mounted inside every deployment that needs it. You can change the site configuration by editing base/config-file.ConfigMap.yaml.
Updates to the site configuration are propagated to the relevant services in about 1 minute. (Future Kubernetes versions will decrease this latency.)
For the impatient, site configuration changes can be applied immediately by changing the name of the ConfigMap. `kubectl apply`ing these changes will force the relevant pods to restart immediately with the new config:
- Change the name of the ConfigMap in all deployments.

  The following convenience script changes the name of the site configuration's ConfigMap (and all references to it) by appending the current date and time. This script should be run at the root of your deploy-sourcegraph-$VERSION folder.

  ```bash
  #!/bin/bash

  # e.g. 2018-08-15t23-42-08z
  CONFIG_DATE=$(date -u +"%Y-%m-%dt%H-%M-%Sz")

  # Update all references to the site config's ConfigMap
  # from 'config-file.*' to 'config-file-$CONFIG_DATE'.
  find . -name "*yaml" -exec sed -i.sedibak -e "s/name: config-file.*/name: config-file-$CONFIG_DATE/g" {} +

  # Delete sed's backup files.
  find . -name "*.sedibak" -delete
  ```
- Apply the new configuration to your Kubernetes cluster:

  ```bash
  ./kubectl-apply-all.sh
  ```
## Configure TLS/SSL

If you intend to make your Sourcegraph instance accessible on the Internet or another untrusted network, you should use TLS so that all traffic will be served over HTTPS.

### Ingress controller

If you exposed your Sourcegraph instance via an ingress controller as described in "Ingress controller (recommended)":
- Create a TLS secret that contains your TLS certificate and private key:

  ```bash
  kubectl create secret tls sourcegraph-tls --key $PATH_TO_KEY --cert $PATH_TO_CERT
  ```

  Update create-new-cluster.sh with the previous command:

  ```bash
  echo kubectl create secret tls sourcegraph-tls --key $PATH_TO_KEY --cert $PATH_TO_CERT >> create-new-cluster.sh
  ```
- Add the `tls` configuration to base/frontend/sourcegraph-frontend.Ingress.yaml:

  ```yaml
  # base/frontend/sourcegraph-frontend.Ingress.yaml
  tls:
    - hosts:
        - example.sourcegraph.com
      secretName: sourcegraph-tls
  ```

  Convenience script:

  ```bash
  # This script requires https://github.com/sourcegraph/jy and https://github.com/sourcegraph/yj
  EXTERNAL_URL=example.sourcegraph.com
  FE=base/frontend/sourcegraph-frontend.Ingress.yaml
  cat $FE | yj | jq --arg host ${EXTERNAL_URL} '.spec.tls += [{hosts: [$host], secretName: "sourcegraph-tls"}]' | jy -o $FE
  ```
- Change your `externalURL` in the site configuration stored in `base/config-file.ConfigMap.yaml`:

  ```json
  {
    "externalURL": "https://example.sourcegraph.com" // Must begin with "https"; replace with the public IP or hostname of your machine
  }
  ```
- Deploy the changes by following the instructions to update the site configuration.
WARNING: Do NOT commit the actual TLS cert and key files to your fork (unless your fork is private and you are okay with storing secrets in it).
### NGINX service

If you exposed your Sourcegraph instance via the alternative NGINX service as described in "NGINX service", those instructions already walked you through setting up TLS/SSL.
## Configure repository cloning via SSH

Sourcegraph will clone repositories using SSH credentials if they are mounted at `/root/.ssh` in the `gitserver` deployment.
- Create a secret that contains the base64 encoded contents of your SSH private key (make sure it doesn't require a password) and known_hosts file:

  ```bash
  kubectl create secret generic gitserver-ssh \
      --from-file id_rsa=${HOME}/.ssh/id_rsa \
      --from-file known_hosts=${HOME}/.ssh/known_hosts
  ```

  Update create-new-cluster.sh with the previous command:

  ```bash
  echo kubectl create secret generic gitserver-ssh \
      --from-file id_rsa=${HOME}/.ssh/id_rsa \
      --from-file known_hosts=${HOME}/.ssh/known_hosts >> create-new-cluster.sh
  ```
- Mount the secret as a volume in gitserver.StatefulSet.yaml.

  For example:

  ```yaml
  # base/gitserver/gitserver.StatefulSet.yaml
  spec:
    containers:
      volumeMounts:
        - mountPath: /root/.ssh
          name: ssh
    volumes:
      - name: ssh
        secret:
          defaultMode: 384
          secretName: gitserver-ssh
  ```

  Convenience script:

  ```bash
  # This script requires https://github.com/sourcegraph/jy and https://github.com/sourcegraph/yj
  GS=base/gitserver/gitserver.StatefulSet.yaml
  cat $GS | yj | jq '.spec.template.spec.containers[].volumeMounts += [{mountPath: "/root/.ssh", name: "ssh"}]' | jy -o $GS
  cat $GS | yj | jq '.spec.template.spec.volumes += [{name: "ssh", secret: {defaultMode: 384, secretName: "gitserver-ssh"}}]' | jy -o $GS
  ```
- Apply the updated `gitserver` configuration to your cluster:

  ```bash
  ./kubectl-apply-all.sh
  ```
WARNING: Do NOT commit the actual `id_rsa` and `known_hosts` files to your fork (unless your fork is private and you are okay with storing secrets in it).
## Configure language servers

Code intelligence is provided through Sourcegraph extensions. Refer to the READMEs for each language for instructions about how to deploy and configure them:
- Go
- JavaScript/TypeScript
- Python
- ... check the extension registry for more (e.g. Java) or create a new extension
## Increase memory or CPU limits

If your instance contains a large number of repositories or monorepos, changing the compute resources allocated to containers can improve performance. See Kubernetes' official documentation for information about compute resources and how to specify them, and see docs/scale.md for specific advice about what resources to tune.
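For example, you might raise the limits for a resource-hungry deployment like this (a sketch; the file path and values are illustrative, not recommendations):

```yaml
# base/searcher/searcher.Deployment.yaml (excerpt; illustrative values)
resources:
  requests:
    cpu: "2"
    memory: 4Gi
  limits:
    cpu: "4"
    memory: 8Gi
```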
## Configure gitserver replica count

Note: If you're creating a new cluster and would like to change `gitserver`'s replica count, do so before running ./kubectl-apply-all.sh for the first time. Changing this after the cluster configuration has been applied will require manually resizing the `indexed-search` volume.

Increasing the number of `gitserver` replicas can improve performance when your instance contains a large number of repositories. Repository clones are consistently striped across all `gitserver` replicas. Other services need to be aware of how many `gitserver` replicas exist so they can resolve an individual repo.

To change the number of `gitserver` replicas:
- Update the `replicas` field in gitserver.StatefulSet.yaml.

- Update the `SRC_GIT_SERVERS` environment variable in the frontend service to reflect the number of replicas.

  For example, if there are 2 gitservers then `SRC_GIT_SERVERS` should have the value `gitserver-0.gitserver:3178 gitserver-1.gitserver:3178`:

  ```yaml
  - env:
      - name: SRC_GIT_SERVERS
        value: gitserver-0.gitserver:3178 gitserver-1.gitserver:3178
  ```

- Update the requested `storage` capacity in base/indexed-search/indexed-search.PersistentVolumeClaim.yaml to be `200Gi` multiplied by the number of `gitserver` replicas.

  For example, if there are 2 `gitserver` replicas, the `storage` requested in base/indexed-search/indexed-search.PersistentVolumeClaim.yaml should have the value `400Gi`:

  ```yaml
  # base/indexed-search/indexed-search.PersistentVolumeClaim.yaml
  spec:
    resources:
      requests:
        storage: 400Gi
  ```
Here is a convenience script that performs all three steps:

```bash
# This script requires https://github.com/sourcegraph/jy and https://github.com/sourcegraph/yj

GS=base/gitserver/gitserver.StatefulSet.yaml

REPLICA_COUNT=2 # number of gitserver replicas

# Update gitserver replica count
cat $GS | yj | jq ".spec.replicas = $REPLICA_COUNT" | jy -o $GS

# Compute all gitserver names
GITSERVERS=$(for i in `seq 0 $(($REPLICA_COUNT-1))`; do echo -n "gitserver-$i.gitserver:3178 "; done)

# Update SRC_GIT_SERVERS environment variable in other services
find . -name "*yaml" -exec sed -i.sedibak -e "s/value: gitserver-0.gitserver:3178.*/value: $GITSERVERS/g" {} +

IDX_SEARCH=base/indexed-search/indexed-search.PersistentVolumeClaim.yaml

# Update the storage requested in indexed-search's persistent volume claim
cat $IDX_SEARCH | yj | jq --arg REPLICA_COUNT "$REPLICA_COUNT" '.spec.resources.requests.storage = ((($REPLICA_COUNT | tonumber) * 200) | tostring) + "Gi"' | jy -o $IDX_SEARCH

# Delete sed's backup files
find . -name "*.sedibak" -delete
```
Commit the outstanding changes.
## Assign resource-hungry pods to larger nodes

If you have a heterogeneous cluster where you need to ensure certain more resource-hungry pods are assigned to more powerful nodes (e.g. `indexedSearch`), you can specify node constraints (such as `nodeSelector`, etc.).

This is useful if, for example, you have a very large monorepo that performs best when `gitserver` and `searcher` are on very large nodes, but you want to use smaller nodes for `sourcegraph-frontend`, `repo-updater`, etc. Node constraints can also be useful to ensure fast updates by ensuring certain pods are assigned to specific nodes, preventing the need for manual pod shuffling.
See the official documentation for instructions about applying node constraints.
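For instance, a `nodeSelector` on the gitserver StatefulSet might look like this (a sketch; `pool: large` is a hypothetical label you would first apply to your larger nodes):

```yaml
# base/gitserver/gitserver.StatefulSet.yaml (excerpt)
spec:
  template:
    spec:
      # Hypothetical label; assign it to your larger nodes first, e.g.
      # kubectl label node $NODE pool=large
      nodeSelector:
        pool: large
```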
## Configure a storage class

Sourcegraph expects a storage class named `sourcegraph`, which it uses for all of its persistent volume claims. This storage class must be configured before applying the base configuration to your cluster.

Create base/sourcegraph.StorageClass.yaml with the appropriate configuration for your cloud provider and commit the file to your fork.
### Google Cloud Platform (GCP)

```yaml
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd # This configures SSDs (recommended).
```
### Amazon Web Services (AWS)

```yaml
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2 # This configures SSDs (recommended).
```
### Azure

```yaml
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Premium_LRS # This configures SSDs (recommended). A Premium VM is required.
```
### Other cloud providers

```yaml
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
# Read https://kubernetes.io/docs/concepts/storage/storage-classes/ to configure the "provisioner" and "parameters" fields for your cloud provider.
# SSDs are highly recommended!
# provisioner:
# parameters:
```
### Using a different storage class name

If you wish to use a different storage class for Sourcegraph, then you need to update all persistent volume claims with the name of the desired storage class. Convenience script:

```bash
#!/bin/bash

# This script requires https://github.com/sourcegraph/jy and https://github.com/sourcegraph/yj
STORAGE_CLASS_NAME=

find . -name "*PersistentVolumeClaim.yaml" -exec sh -c "cat {} | yj | jq '.spec.storageClassName = \"$STORAGE_CLASS_NAME\"' | jy -o {}" \;

GS=base/gitserver/gitserver.StatefulSet.yaml
cat $GS | yj | jq --arg STORAGE_CLASS_NAME $STORAGE_CLASS_NAME '.spec.volumeClaimTemplates = (.spec.volumeClaimTemplates | map( . * {spec:{storageClassName: $STORAGE_CLASS_NAME }}))' | jy -o $GS
```
## Configure Lightstep tracing

Lightstep is a closed-source distributed tracing and performance monitoring tool created by some of the authors of Dapper. Every Sourcegraph deployment supports Lightstep, and it can be configured via the following environment variables (with example values):
```yaml
env:
  # https://about.sourcegraph.com/docs/config/site/#lightstepproject-string
  - name: LIGHTSTEP_PROJECT
    value: my_project
  # https://about.sourcegraph.com/docs/config/site/#lightstepaccesstoken-string
  - name: LIGHTSTEP_ACCESS_TOKEN
    value: abcdefg
  # If false, any logs (https://github.com/opentracing/specification/blob/master/specification.md#log-structured-data)
  # from spans will be omitted from the spans sent to Lightstep.
  - name: LIGHTSTEP_INCLUDE_SENSITIVE
    value: "true"
```
To enable this, you must first purchase Lightstep and create a project corresponding to the Sourcegraph instance. Then, add the above environment variables to each deployment.
## Configure custom Redis

Sourcegraph supports specifying a custom Redis server for:

- caching information (specified via the `REDIS_CACHE_ENDPOINT` environment variable)
- storing information (session data) (specified via the `REDIS_STORE_ENDPOINT` environment variable)

If you want to specify a custom Redis server, you'll need to specify the corresponding environment variable for each of the following deployments:

- `sourcegraph-frontend`
- `repo-updater`
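For example, each of those deployments might get an `env` entry like the following (a sketch; the endpoints are placeholders for your own Redis instances):

```yaml
env:
  # Placeholder endpoints; point these at your own Redis servers.
  - name: REDIS_CACHE_ENDPOINT
    value: redis-cache.example.com:6379
  - name: REDIS_STORE_ENDPOINT
    value: redis-store.example.com:6379
```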
## Configure custom PostgreSQL

You may prefer to configure Sourcegraph to store data in an external PostgreSQL instance if you already have existing database management or backup infrastructure.

Simply edit the relevant PostgreSQL environment variables (e.g. `PGHOST`, `PGPORT`, `PGUSER`, etc.) in base/frontend/sourcegraph-frontend.Deployment.yaml to point to your existing PostgreSQL instance.
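A sketch of what that might look like (the host and user shown are placeholders; consider storing credentials in a Secret rather than inlining them):

```yaml
# base/frontend/sourcegraph-frontend.Deployment.yaml (excerpt; placeholder values)
env:
  - name: PGHOST
    value: postgres.example.com
  - name: PGPORT
    value: "5432"
  - name: PGUSER
    value: sourcegraph
  - name: PGDATABASE
    value: sourcegraph
```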
## Install without RBAC

Sourcegraph Data Center communicates with the Kubernetes API for service discovery. It also has some janitor DaemonSets that clean up temporary cache data. To do that we need to create RBAC resources.

If using RBAC is not an option, then you will not want to apply the `*.Role.yaml` and `*.RoleBinding.yaml` files.
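One way to do that, as a sketch, is to remove the RBAC manifests from your fork so that the apply script never sees them:

```bash
# A sketch: delete RBAC manifests so ./kubectl-apply-all.sh skips them.
find base -name "*.Role.yaml" -delete
find base -name "*.RoleBinding.yaml" -delete
```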
## Add license key

Beginning in version 2.12.0, Sourcegraph's Kubernetes deployment requires an Enterprise license key.

- Create an account on or sign in to sourcegraph.com, and go to https://sourcegraph.com/users/subscriptions/new to buy a license key.

- Once you have a license key, add it to your configuration by editing `base/config-file.ConfigMap.yaml`:

  ```yaml
  # base/config-file.ConfigMap.yaml
  config.json: |-
    {
      "licenseKey": "YOUR_LICENSE_KEY"
    }
  ```

- Run `./kubectl-apply-all.sh` to apply the changes to your cluster.