I recently became responsible for a number of GitHub organizations, each with many repositories. During a review, I noticed a Slack webhook URL committed to a config file in one of those repositories. Slack webhook URLs are secrets, because anyone who has the URL can push a message into Slack. Slack has this to say about them:
Fortunately, this webhook URL and the other secrets I found later were all in private GitHub repos, so they were never exposed publicly. The Slack webhook URL had been sitting in the config file for a year when I discovered it, so it seems that either there aren’t any internal processes actively monitoring for secrets in source code, or they don’t treat a Slack webhook URL as an issue. I was curious what other secrets I might find, so I downloaded the trufflehog CLI and started scanning.
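Trufflehog is a tool designed to find secrets in source code. Older versions looked for high-entropy strings; the current version (v3) matches known credential formats with detectors and can verify each match against the issuing service, which is what the --only-verified flag is for. You can learn more about the tool here: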
https://github.com/trufflesecurity/trufflehog
It’s quite easy to scan an entire GitHub organization. All it requires is a single command:
trufflehog github --endpoint https://github.com --org my-org --token my-token -j --only-verified
If trufflehog finds secrets in your repositories, it prints one JSON object per finding, similar to the one shown below:
{"SourceMetadata":{"Data":{"Github":{"link":"https://github.com/org/repo/file.yml","repository":"https://github.com/org/repo.git","commit":"commit-hash","email":"user@test.com","file":"file.yml","timestamp":"2023-01-17 09:50:57 -0500 -0500","line":40,"visibility":1}}},"SourceID":0,"SourceType":7,"SourceName":"trufflehog - github","DetectorType":30,"DetectorName":"SlackWebhook","DecoderName":"PLAIN","Verified":true,"Raw":"https://hooks.slack.com/fake/webhook/url","Redacted":"","ExtraData":null,"StructuredData":null}
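Since each finding is a single JSON object on its own line, the output is easy to post-process. Below is a minimal Python sketch that counts verified findings per detector. The function name `summarize_findings` is hypothetical, and the sketch assumes that every line starting with `{` is a finding with the `Verified` and `DetectorName` fields shown above.

```python
import json
from collections import Counter


def summarize_findings(lines):
    """Count verified findings per detector from trufflehog's JSON-lines output."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and anything that is not a JSON object
        finding = json.loads(line)
        if finding.get("Verified"):
            counts[finding.get("DetectorName", "unknown")] += 1
    return counts
```

With the scan output redirected to a file, passing the open file object to `summarize_findings` would give a quick overview of which kinds of secrets dominate.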
When trufflehog finished scanning, it had identified a number of secrets in a few repositories. The most common issue, however, was Slack webhook URLs embedded in config files. Once they were identified, I moved these secrets to a HashiCorp Vault instance and updated the config files to remove the Slack webhook. Updating the file is not enough on its own, because the secret still exists in earlier commits; you also need to rewrite the repository history to remove it completely. To do this, you can use a tool like BFG Repo-Cleaner.
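As a rough sketch of how the scan results could feed the history rewrite, the Python below turns the `Raw` values from trufflehog’s output into a replacements file for BFG’s `--replace-text` option. The function name `bfg_replacements` and the placeholder text are my own, and the `literal==>replacement` line format is my reading of the BFG documentation, so check it against the version you use.

```python
import json


def bfg_replacements(lines, placeholder="***REMOVED***"):
    """Build the contents of a BFG --replace-text file from trufflehog findings."""
    secrets = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # ignore anything that is not a JSON finding
        raw = json.loads(line).get("Raw", "")
        if raw and raw not in secrets:
            secrets.append(raw)  # keep each secret once, in first-seen order
    # One "literal==>replacement" entry per line (assumed BFG format).
    return "".join(f"{secret}==>{placeholder}\n" for secret in secrets)
```

The resulting text would be saved to a file (for example replacements.txt) and passed to BFG against a mirror clone of the repository, followed by the git reflog and gc cleanup that the BFG documentation describes.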
A few days later, I decided to automate the secret scanning process using trufflehog and the Concourse CI tool. Concourse CI is known as a “continuous thing doer” and I’ve used it for CI/CD for a number of years. You can read more about it here:
I wanted my secret scanning pipeline to run every night to warn us if any secrets were found. Here is a simplified version of the pipeline I developed:
resource_types:
- name: slack-notification
  type: registry-image
  source:
    repository: cfcommunity/slack-notification-resource
    tag: latest
resources:
- name: once-nightly
  type: time
  icon: clock
  check_every: 2h
  source:
    start: 1:00 AM
    stop: 4:00 PM
    days: [Monday, Tuesday, Wednesday, Thursday, Friday]
    location: America/New_York
- name: slack-notify
  type: slack-notification
  icon: slack
  source:
    url: ((slack-webhook))
- name: trufflehog-image
  type: registry-image
  icon: docker
  source:
    repository: ((docker-repository))
    username: ((docker-username))
    password: ((docker-password))
jobs:
- name: scan-organizations
  plan:
  - get: once-nightly
    trigger: true
  - get: trufflehog-image
    trigger: true
  - task: trufflehog-scan-org
    image: trufflehog-image
    config:
      platform: linux
      outputs:
      - name: results
      run:
        path: bash
        args:
        - -exc
        - |
          # Scan the organization to find any secrets:
          trufflehog github --endpoint https://github.com --org org --token ${GIT_TOKEN} -j --only-verified >> result.json
          # Check the number of lines in the file to determine how many secrets were found:
          jq -s length result.json
          issues=`jq -s length result.json`
          # Write a message to a file that will be sent to notify the team secrets were found:
          echo "Trufflehog found $issues verified issues in the organization." >> results/org.txt
      params:
        GIT_TOKEN: ((git.access-token))
  on_success:
    do:
    - put: slack-notify
      params:
        username: Concourse
        silent: true
        text_file: results/org.txt
        text: |
          :concourse-succeeded: [*SUCCESS*] *$BUILD_PIPELINE_NAME* | *$BUILD_JOB_NAME*
          Result: $TEXT_FILE_CONTENT
  on_failure:
    do:
    - put: slack-notify
      params:
        username: Concourse
        silent: true
        text: |
          The scanning pipeline failed.
The Concourse pipeline is fairly simple. It sets up a few resources, and its main job scans the specified organization’s repositories for secrets. The results of the scan are written to a file, and the number of entries in that file gives the number of issues found. Two things are left out of the simplified pipeline above: the job also stores the report in S3 so the developers can take action on any secrets found, and the real pipeline scans multiple GitHub organizations in parallel.
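The parallel steps of the real pipeline are not reproduced in this post, so the following is only a standalone Python sketch of the same idea outside Concourse: run the scan command once per organization, concurrently, and build a summary message. The helper names (`scan_command`, `run_scan`, `scan_all`, `summary`) are hypothetical, and the sketch assumes, as the pipeline script does, that the scan’s standard output contains only the JSON findings.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def scan_command(org, token):
    """Return the argv for the same organization scan the pipeline runs."""
    return ["trufflehog", "github", "--endpoint", "https://github.com",
            "--org", org, "--token", token, "-j", "--only-verified"]


def run_scan(org, token):
    """Run one scan and return the number of JSON findings printed on stdout."""
    proc = subprocess.run(scan_command(org, token),
                          capture_output=True, text=True, check=True)
    return sum(1 for line in proc.stdout.splitlines()
               if line.strip().startswith("{"))


def scan_all(orgs, token, scan=run_scan, max_workers=4):
    """Scan several organizations concurrently; returns {org: finding_count}."""
    orgs = list(orgs)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = pool.map(lambda org: scan(org, token), orgs)
        return dict(zip(orgs, counts))


def summary(counts):
    """Build one message line per organization, mirroring the pipeline's text."""
    return "\n".join(f"Trufflehog found {n} verified issues in {org}."
                     for org, n in counts.items())
```

The `scan` parameter is injectable so the orchestration can be exercised without calling the real CLI. Threads are enough here because each worker spends its time waiting on an external process.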
I also built a Docker image that contains the trufflehog CLI and the other tools the pipeline needs. The Dockerfile looks something like this:
FROM golang:alpine
RUN apk update && apk add git jq curl gcc bash musl-dev openssl-dev ca-certificates && update-ca-certificates
RUN apk add --no-cache aws-cli
RUN git clone https://github.com/trufflesecurity/trufflehog.git
RUN cd trufflehog && go install
If you are not currently scanning source code for secrets, I would highly recommend setting up a system like this to scan continually. If you have public repos, some companies like Stripe and Slack will scan repositories for their compromised keys. GitHub also has secret scanning for public repos, but as far as I’ve seen it’s not available for private repos unless you pay for GitHub Advanced Security. If you’re using source control other than GitHub, or an on-prem version, it’s even more worthwhile to set up something like this to catch any secrets. Better safe than sorry.