Personalized Code Searches Using OpenGrok
Improve your efficiency by deploying OpenGrok to provide code search capabilities on just the code you're interested in and not everything available.
Many organizations implement a full-featured code search/intelligence tool (e.g., Sourcegraph or Atlassian Fisheye) that lets engineers search the enterprise code base. Others just use the native search of their hosted version control system, such as GitHub's or GitLab's. There are other commercial and open-source ways to search code.
Most of us, however, do not regularly work on code spanning the enterprise and would prefer results related to our current activities. You may also need to search the open-source projects incorporated into your work. If this describes you, running OpenGrok on your laptop with a limited set of projects might be a solution.
If you've never heard of OpenGrok, it is a fully functioning open-source code search/intelligence tool that works with many version control systems and multiple languages.
Prerequisites
These instructions were tested using Docker Desktop and should work with other container runtimes (e.g., Colima, Podman, or containerd), though slight adjustments might be necessary. A helpful page can be found here.
If necessary, install the container runtime of your choice.
Create Docker Container
OpenGrok automatically pulls changes and rebuilds its indexes hourly by default.
While open-source projects usually do not restrict read-only access to their code, organizations secure their code to protect their intellectual property. Allowing OpenGrok to access a private GitHub repository requires providing it with a valid SSH private key associated with your GitHub account. A customized Docker container is required to achieve this.
Clone OpenGrok
The entire OpenGrok code base is in GitHub. Clone the repo.
scsosna@mymachine src % git clone https://github.com/oracle/opengrok.git
Change Directory
Your current working directory must be the just-cloned directory.
scsosna@mymachine src % cd opengrok
Copy SSH Key
These instructions are for accessing a git repository via SSH authentication, such as GitHub.
Copy the private key associated with your GitHub account from your .ssh directory into the OpenGrok directory. For this example, the file is named id_ed25519.
scsosna@mymachine opengrok % cp ~/.ssh/id_ed25519 .
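SSH ignores private keys that other users can read, so it can help to confirm that the copy kept owner-only permissions before it goes into an image. A minimal sketch, assuming a POSIX shell; the helper name key_is_private is hypothetical and it checks for mode 600 specifically:

```shell
# Hypothetical helper: succeeds only if the file exists with mode 600 (rw-------).
key_is_private() {
    key_file=$1
    # find prints the path only when the permission bits match exactly.
    [ -n "$(find "$key_file" -maxdepth 0 -type f -perm 600 2>/dev/null)" ]
}
```

For example, `key_is_private id_ed25519 || chmod 600 id_ed25519` tightens the mode only when needed. Because the key now sits inside a git working tree, take care not to commit it.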
Modify Dockerfile
Two changes are required to the provided Dockerfile:
- Remove unused version control systems: most common version control systems are supported. Unused ones may be removed from your rebuilt container;
- Add the SSH key: the SSH key copied into the OpenGrok repository needs to be explicitly included in the built container.
This git patch can be applied directly to the cloned OpenGrok repo.
index ae0b52d76e..e1e3771a1a 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -55,9 +55,12 @@ RUN echo 'deb http://package.perforce.com/apt/ubuntu bionic release' > /etc/apt/
# install dependencies and Python tools
# hadolint ignore=DL3008,DL3009
RUN apt-get update && \
- apt-get install --no-install-recommends -y git subversion mercurial cvs cssc bzr rcs rcs-blame helix-p4d \
+ apt-get install --no-install-recommends -y git \
unzip inotify-tools python3 python3-pip \
python3-venv python3-setuptools openssh-client
+RUN mkdir -p /root/.ssh
+COPY id_ed25519 /root/.ssh/id_ed25519
+RUN chmod 700 /root/.ssh && chmod 600 /root/.ssh/id_ed25519
# compile and install universal-ctags
# hadolint ignore=DL3003,DL3008
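One way to apply the patch is to save it to a file and let git verify it first. A sketch, assuming the patch text was saved as opengrok-ssh.patch (a hypothetical filename) in the cloned directory; the helper name is also hypothetical. Because the patch targets a specific revision of the Dockerfile, the check may fail on a newer checkout, in which case the same two changes can be made by hand:

```shell
# Hypothetical helper: dry-run the patch, then apply it only if it applies cleanly.
apply_patch_checked() {
    patch_file=$1
    git apply --check "$patch_file" && git apply "$patch_file"
}

# Example (the filename is an assumption):
#   apply_patch_checked opengrok-ssh.patch
```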
Build
scsosna@mymachine opengrok % docker build -t opengrok .
Clone Repos
A fully-qualified directory path is provided when starting the container, which identifies the directory for OpenGrok to index.
I recommend using a dedicated, separate directory into which the repos of interest are cloned. OpenGrok processes each subdirectory as a separate project, pulls recent changes, and reindexes.
scsosna@mymachine opengrok % cd ~/data/src
scsosna@mymachine src % mkdir repos
scsosna@mymachine src % cd repos
scsosna@mymachine repos % git clone git@github.com:spring-projects/spring-boot.git
scsosna@mymachine repos % git clone git@github.com:spring-projects/spring-framework.git
scsosna@mymachine repos % git clone git@github.com:spring-projects/spring-security.git
scsosna@mymachine repos % git clone git@github.com:square/retrofit.git
scsosna@mymachine repos % git clone git@github.com:scsosna99/neo4j-gradle-dependencies.git
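The clone steps can also be scripted so the list of projects is easy to change and rerun. A minimal sketch, assuming a POSIX shell; both helper names are hypothetical, and the URLs passed in are whichever repos you want indexed:

```shell
# Hypothetical helper: derive the directory name git would create for a clone URL.
repo_dir_name() {
    basename "$1" .git
}

# Hypothetical helper: clone each URL into the current directory,
# skipping any repo whose directory already exists.
clone_missing() {
    for url in "$@"; do
        dir=$(repo_dir_name "$url")
        if [ -d "$dir" ]; then
            echo "skipping $dir (already cloned)"
        else
            git clone "$url" "$dir"
        fi
    done
}
```

Run from the repos directory, `clone_missing git@github.com:square/retrofit.git git@github.com:spring-projects/spring-boot.git` clones only what is not already there.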
Start OpenGrok
Run Container
A simple command provides the local directory in which the repos to index are located. In my example, it's ~/data/src/repos.
scsosna@mymachine repos % docker run -d -v ~/data/src/repos:/opengrok/src -p 8080:8080 opengrok
Note: -p 8080:8080 maps machine port 8080 to container port 8080. If 8080 is unavailable on your machine, choose another host port, preferably a non-privileged port above 1023, such as -p 1234:8080.
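To make the port choice explicit, the run command can be wrapped in a small function. A sketch, assuming the image was tagged opengrok as in the build step; the function name start_opengrok and its parameters are hypothetical:

```shell
# Hypothetical wrapper: the first argument is the directory of cloned repos,
# the second is the host port (defaults to 8080). The container side stays 8080.
start_opengrok() {
    src_dir=$1
    host_port=${2:-8080}
    docker run -d \
        -v "$src_dir":/opengrok/src \
        -p "$host_port":8080 \
        opengrok
}
```

For example, `start_opengrok ~/data/src/repos 9090` would serve OpenGrok at http://localhost:9090.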
Update Known Hosts
Initially, GitHub's host key is not in the container's list of known hosts and needs to be confirmed. OpenGrok will attempt its git pull, but the command fails because the host is still unrecognized. To approve it, open a shell in the container and execute a manual git pull. You'll only need to do this once.
The docker ps command shows the names of running containers; in our example, bold_chandrasekhar. After entering the docker shell, change to any source repo and execute a git pull.
scsosna@mymachine repos % docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a77a49d1d675 opengrok "/scripts/start.py" 58 seconds ago Up 56 seconds 0.0.0.0:8080->8080/tcp bold_chandrasekhar
scsosna@mymachine repos % docker exec -it bold_chandrasekhar bash
root@a77a49d1d675:/usr/local/tomcat# cd /opengrok/src
root@a77a49d1d675:/opengrok/src# cd user-service
root@a77a49d1d675:/opengrok/src/user-service# git pull
The authenticity of host 'github.com (140.82.112.3)' can't be established.
ED25519 key fingerprint is SHA256:+DiY3wvvV6TuJJhbpZisF/zLDA0zPMSvHdkr4UvCOqU.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
Already up to date.
root@a77a49d1d675:/opengrok/src/user-service# exit
scsosna@mymachine repos %
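As an alternative to answering the prompt interactively, the host key can be appended to the container's known_hosts file with ssh-keyscan, which ships with the OpenSSH client. A sketch, assuming the container runs as root as in the session above; the function name is hypothetical, and the scanned key should still be checked against the fingerprints GitHub publishes:

```shell
# Hypothetical helper: append github.com's host keys to root's known_hosts
# inside the named container.
trust_github_host() {
    container=$1
    docker exec "$container" sh -c \
        'ssh-keyscan github.com >> /root/.ssh/known_hosts'
}

# Example (container name taken from `docker ps`):
#   trust_github_host bold_chandrasekhar
```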
Restart Container
Since OpenGrok indexes every hour, you can wait an hour for the next re-indexing or restart the container to force an immediate re-index.
scsosna@mymachine repos % docker restart bold_chandrasekhar
bold_chandrasekhar
scsosna@mymachine repos %
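After a restart, the web application may take a little while to respond. A sketch of a polling loop, assuming curl is installed on the host; the function name, the default retry count, and the two-second pause are hypothetical choices:

```shell
# Hypothetical helper: poll a URL until it answers successfully or the tries run out.
wait_for_http() {
    url=$1
    tries=${2:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl fail on HTTP errors; -s and -o keep it quiet.
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    return 1
}
```

For example, `wait_for_http http://localhost:8080 && echo "OpenGrok is responding"`. A successful response only shows that the web application is up, not that re-indexing has finished.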
Browse OpenGrok
Navigate to http://localhost:8080 to access OpenGrok:
Select the project(s) to include in your search:
Enter a search and review the results:
Conclusions
I'm a long-time advocate of OpenGrok but hadn't used it recently, so I was pleasantly surprised to see how easy it is to set up in a local environment using Docker.
Using it locally in this manner has definitely helped productivity for the reasons described: I'm only interested in a subset of the enterprise code base.
I have noticed that the indexes appear to become corrupted over time, which requires starting a fresh container, but that effort is fairly minor.