
Building a Career Recommendation Engine With Neo4j


It's simple to build a career recommendation engine with Neo4j. Learn how to do so based on what technology a developer knows, how advanced they are, and where they live.


Neo4j is a graph database management system developed by Neo4j, Inc. Described by its developers as an ACID-compliant transactional database with native graph storage and processing, Neo4j is the most popular graph database according to the DB-Engines ranking. A graph database is a good fit when relationships have a direction; for example, a recommendation system must account for the fact that knowing a famous person does not mean that person knows you. Beyond direction, each relationship can carry its own properties, which makes it more expressive than a relationship in a relational database. This article gives a simple example of a recommendation engine with Neo4j.

Install Neo4j With Docker

Install Docker. Run the Docker command:

docker run --publish=7474:7474 --publish=7687:7687 --volume=$HOME/neo4j/data:/data neo4j

Configure Neo4j at http://localhost:7474.

Creating a Career Recommendation

This application makes career recommendations: given a developer who knows a technology at a certain level and lives in a certain city, the application returns:

  • Developers who know the technology.

  • Developers from the city.

  • Developers from a city who know the technology and have a given level of knowledge.

In a relational database, modeling starts with normalization so that a developer can know more than one technology. The same concept applies between city and developer, resulting in several N-to-N relationships, even in a simple recommendation.

In a graph, the model is simpler than in a relational database: a developer works with a technology at a certain level (stored as an edge property), and the developer lives in a certain city.
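To make the model concrete, here is a minimal in-memory sketch of a property graph in plain Java. It is only an illustration of vertices, directed edges, and edge properties; the class names and the "level" property key are assumptions made for this sketch and are not part of Neo4j, TinkerPop, or JNoSQL.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative in-memory property graph; this is not the Neo4j or TinkerPop API.
final class CareerGraphSketch {

    // A vertex carries a label (BUDDY, CITY, TECHNOLOGY) and a name.
    static final class Vertex {
        final String label;
        final String name;

        Vertex(String label, String name) {
            this.label = label;
            this.name = name;
        }
    }

    // A directed edge from 'out' to 'in' that carries its own properties.
    static final class Edge {
        final Vertex out;
        final String label;
        final Vertex in;
        final Map<String, String> properties;

        Edge(Vertex out, String label, Vertex in, Map<String, String> properties) {
            this.out = out;
            this.label = label;
            this.in = in;
            this.properties = properties;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    Edge edge(Vertex out, String label, Vertex in, Map<String, String> properties) {
        Edge edge = new Edge(out, label, in, properties);
        edges.add(edge);
        return edge;
    }

    // Edges leaving a vertex; direction matters, so the target vertex is never matched.
    List<Edge> outgoing(Vertex from, String label) {
        return edges.stream()
                .filter(e -> e.out == from && e.label.equals(label))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        CareerGraphSketch graph = new CareerGraphSketch();
        Vertex jose = new Vertex("BUDDY", "jose");
        Vertex javaTech = new Vertex("TECHNOLOGY", "java");
        Vertex santos = new Vertex("CITY", "santos");

        // "level" is a hypothetical property key chosen for this sketch.
        graph.edge(jose, "WORKS", javaTech, Collections.singletonMap("level", "advanced"));
        graph.edge(jose, "LIVES", santos, Collections.emptyMap());

        for (Edge e : graph.outgoing(jose, "WORKS")) {
            System.out.println(e.out.name + " -[" + e.label + " " + e.properties + "]-> " + e.in.name);
        }
    }
}
```

The level lives on the edge itself, so no join table is needed to say how well a developer knows a technology.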

The Dependencies

As usual, the minimum requirement is any Java EE 8 server and Java 8. Beyond this, the project uses the Eclipse JNoSQL mapping layer (Artemis), Apache TinkerPop, and the Neo4j driver.

<dependencies>
    <dependency>
        <groupId>org.jnosql.artemis</groupId>
        <artifactId>graph-extension</artifactId>
        <version>0.0.4</version>
    </dependency>
    <dependency>
        <groupId>org.jnosql.artemis</groupId>
        <artifactId>artemis-configuration</artifactId>
        <version>0.0.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.tinkerpop</groupId>
        <artifactId>gremlin-core</artifactId>
        <version>${tinkerpop.version}</version>
    </dependency>
    <dependency>
        <groupId>com.steelbridgelabs.oss</groupId>
        <artifactId>neo4j-gremlin-bolt</artifactId>
        <version>0.2.25</version>
    </dependency>
    <dependency>
        <groupId>org.neo4j.driver</groupId>
        <artifactId>neo4j-java-driver</artifactId>
        <version>1.4.3</version>
    </dependency>
    <dependency>
        <groupId>javax</groupId>
        <artifactId>javaee-api</artifactId>
        <version>8.0</version>
        <type>jar</type>
        <scope>provided</scope>
    </dependency>
</dependencies>  

Eclipse JNoSQL integrates with Apache TinkerPop. The first step is to make an Apache TinkerPop Graph available in the CDI container.

public interface GraphSupplier extends Supplier<Graph> {

}

@ApplicationScoped
public class GraphProducer {

    @Inject
    private GraphSupplier graphSupplier;

    @Produces
    @RequestScoped
    public Graph getGraph() {
        return graphSupplier.get();
    }

    public void dispose(@Disposes Graph graph) throws Exception {
        graph.close();
    }
}
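The producer and disposer pair gives each request its own Graph and closes it when the request ends. The following plain-Java sketch imitates that lifecycle without CDI; FakeGraph and inRequest are hypothetical names used only for illustration, and the real container manages this through the annotations shown above.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Plain-Java illustration of a request-scoped producer/disposer pair; no CDI involved.
final class RequestScopedSketch {

    // Stand-in for the Graph that the producer hands out and the disposer closes.
    static final class FakeGraph implements AutoCloseable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    // Gets one resource per "request" from the supplier and always closes it afterwards.
    static <R> R inRequest(Supplier<FakeGraph> supplier, Function<FakeGraph, R> work) {
        try (FakeGraph graph = supplier.get()) {
            return work.apply(graph);
        }
    }

    public static void main(String[] args) {
        FakeGraph[] seen = new FakeGraph[1];
        String result = inRequest(FakeGraph::new, graph -> {
            seen[0] = graph;
            return "open during request: " + !graph.closed;
        });
        System.out.println(result);
        System.out.println("closed after request: " + seen[0].closed);
    }
}
```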

The default supplier implementation injects the Neo4j driver and builds the graph from it.

@ApplicationScoped
public class DefaultGraphSupplier implements GraphSupplier {

    private static final Neo4JElementIdProvider<?> VERTEX_ID_PROVIDER = new Neo4JNativeElementIdProvider();
    private static final Neo4JElementIdProvider<?> EDGE_PROVIDER = new Neo4JNativeElementIdProvider();


    @Inject
    @ConfigurationUnit
    private Instance<Driver> driver;

    @Override
    public Graph get() {
        Neo4JGraph graph = new Neo4JGraph(driver.get(), VERTEX_ID_PROVIDER, EDGE_PROVIDER);
        graph.setProfilerEnabled(true);
        return graph;
    }

}

The META-INF/jnosql.json file holds the Neo4j configuration.

[
  {
    "description": "The Neo4J configuration",
    "name": "name",
    "settings": {
      "url": "bolt://localhost:7687",
      "admin": "neo4j",
      "password": "admin"
    }
  }
]

Modeling

To keep the sample clean, each entity needs a display name and a URL-friendly name. To provide the URL-friendly form, there's a Name type.

import java.util.Locale;
import java.util.Objects;
import java.util.function.Supplier;

import static java.text.Normalizer.Form.NFD;
import static java.text.Normalizer.normalize;
import static java.util.Objects.requireNonNull;

public final class Name implements Supplier<String> {

    private final String value;

    private Name(String value) {
        requireNonNull(value, "value is required");
        this.value = normalize(value.toLowerCase(Locale.US).replace(" ", "_"), NFD);
    }


    @Override
    public String get() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Name)) {
            return false;
        }
        Name name = (Name) o;
        return Objects.equals(value, name.value);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(value);
    }

    @Override
    public String toString() {
        return value;
    }

    public static Name of(String name) {
        return new Name(name);
    }
}

@Entity(value = "BUDDY")
public class Buddy implements Serializable {

    @Id
    private Long id;

    @Column
    @Convert(NameConverter.class)
    private Name name;

    @Column
    private String displayName;

    @Column
    private Double salary;

    //...

}

@Entity(value = "CITY")
public class City implements Serializable {

    @Id
    private Long id;

    @Column
    @Convert(NameConverter.class)
    private Name name;

    @Column
    private String displayName;
 //... 
}

@Entity(value = "TECHNOLOGY")
public class Technology implements Serializable {

    @Id
    private Long id;

    @Column
    @Convert(NameConverter.class)
    private Name name;

    @Column
    private String displayName;

 //...

}

As in JPA, a converter maps Name to String by implementing AttributeConverter.

public class NameConverter implements AttributeConverter<Name, String> {

    @Override
    public String convertToDatabaseColumn(Name attribute) {
        if (attribute == null) {
            return null;
        }
        return attribute.get();
    }

    @Override
    public Name convertToEntityAttribute(String dbData) {

        if (dbData == null) {
            return null;
        }
        return Name.of(dbData);
    }
}
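To see what the Name normalization produces, the sketch below repeats the same steps in a standalone class. The stripAccents step is a hypothetical addition that is not in the Name class above: NFD only splits an accented letter into a base letter plus a combining mark, so a name is plain ASCII only if those marks are removed afterwards.

```java
import java.text.Normalizer;
import java.util.Locale;

// Standalone illustration of the normalization done by Name, plus an optional extra step.
final class NameNormalizationSketch {

    // Same steps as the Name constructor: lower-case, spaces to underscores, NFD.
    static String normalize(String value) {
        return Normalizer.normalize(value.toLowerCase(Locale.US).replace(" ", "_"), Normalizer.Form.NFD);
    }

    // Hypothetical addition: drop the combining marks that NFD splits off.
    static String stripAccents(String normalized) {
        return normalized.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(normalize("Belo Horizonte"));                 // belo_horizonte
        System.out.println(stripAccents(normalize("S\u00e3o Paulo")));   // sao_paulo
    }
}
```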

Repository

Artemis provides repositories; we just need to extend the Repository interface. It also supports a powerful feature called method queries, which derive the query from the method name; at this point, we only find and delete by name.

public interface TechnologyRepository extends Repository<Technology, Long> {

    Optional<Technology> findByName(String name);

    void deleteByName(String name);
}

public interface CityRepository extends Repository<City, Long> {

    Optional<City> findByName(String name);

    void deleteByName(String name);
}

public interface BuddyRepository extends Repository<Buddy, Long> {


    Optional<Buddy> findByName(String name);

    void deleteByName(String buddyName);
}
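The idea behind a method query is that the method name itself describes the query. The sketch below shows that idea with a tiny parser; it is an illustration only, not how Artemis parses method names, and MethodQuerySketch, Query, and parse are hypothetical names.

```java
import java.util.Locale;

// Illustration of the idea behind method queries; this is not the Artemis implementation.
final class MethodQuerySketch {

    static final class Query {
        final String action;   // "find" or "delete"
        final String property; // entity attribute to match, e.g. "name"

        Query(String action, String property) {
            this.action = action;
            this.property = property;
        }

        @Override
        public String toString() {
            return action + " where " + property + " = ?";
        }
    }

    // Splits "<action>By<Property>" into its two parts.
    static Query parse(String methodName) {
        int by = methodName.indexOf("By");
        if (by <= 0 || by + 2 >= methodName.length()) {
            throw new IllegalArgumentException("expected <action>By<Property>: " + methodName);
        }
        String action = methodName.substring(0, by);
        String property = methodName.substring(by + 2);
        // Lower-case the first letter: "Name" becomes "name".
        property = property.substring(0, 1).toLowerCase(Locale.ROOT) + property.substring(1);
        return new Query(action, property);
    }

    public static void main(String[] args) {
        System.out.println(parse("findByName"));   // find where name = ?
        System.out.println(parse("deleteByName")); // delete where name = ?
    }
}
```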

Graph traversals allow complex queries across both vertices and edges. To create and query these relationships, there is a service for buddies.

@ApplicationScoped
public class BuddyService {

    @Inject
    private GraphTemplate graphTemplate;

    public List<Buddy> findByTechnology(String technology) throws NullPointerException {
        requireNonNull(technology, "technology is required");

        Stream<Buddy> buddies = graphTemplate.getTraversalVertex()
                .hasLabel(Technology.class)
                .has("name", technology)
                .in(Edges.WORKS).orderBy("name").asc().stream();

        return buddies.collect(Collectors.toList());
    }

    public List<Buddy> findByTechnology(String technology, TechnologyLevel level) throws NullPointerException {
        requireNonNull(technology, "technology is required");
        requireNonNull(level, "level is required");

        Stream<Buddy> buddies = graphTemplate.getTraversalVertex()
                .hasLabel(Technology.class)
                .has("name", technology)
                .inE(Edges.WORKS).has(TechnologyLevel.EDGE_PROPERTY, level.get())
                .outV().orderBy("name").asc().stream();

        return buddies.collect(Collectors.toList());
    }

    public List<Buddy> findByCity(String city) throws NullPointerException {
        requireNonNull(city, "city is required");

        Stream<Buddy> buddies = graphTemplate.getTraversalVertex()
                .hasLabel(City.class)
                .has("name", city)
                .in(Edges.LIVES)
                .orderBy("name").asc().stream();

        return buddies.collect(Collectors.toList());
    }

    public List<Buddy> findByTechnologyAndCity(String technology, String city) throws NullPointerException {
        requireNonNull(technology, "technology is required");
        requireNonNull(city, "city is required");

        Stream<Buddy> buddies = graphTemplate.getTraversalVertex()
                .hasLabel(Technology.class)
                .has("name", Name.of(technology).get())
                .in(Edges.WORKS)
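                .filter(b -> graphTemplate.getEdges(b, Direction.OUT, Edges.LIVES).stream()
                            .<City>map(EdgeEntity::getInbound)
                            // Compare by normalized name; a City never equals a raw String.
                            // Assumes City exposes getName() in the part elided above.
                            .anyMatch(c -> Name.of(city).equals(c.getName()))
                    ).orderBy("name").asc().stream();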

        return buddies.collect(Collectors.toList());
    }

    public void live(Buddy buddy, City city) throws NullPointerException{
        requireNonNull(buddy, "buddy is required");
        requireNonNull(city, "city is required");
        graphTemplate.edge(buddy, Edges.LIVES, city);
    }

    public void work(Buddy buddy, Technology technology) {
        requireNonNull(buddy, "buddy is required");
        requireNonNull(technology, "technology is required");

        graphTemplate.edge(buddy, Edges.WORKS,technology);
    }

    public void work(Buddy buddy, Technology technology, TechnologyLevel level) {
        requireNonNull(buddy, "buddy is required");
        requireNonNull(technology, "technology is required");
        requireNonNull(level, "level is required");

        EdgeEntity edge = graphTemplate.edge(buddy, Edges.WORKS, technology);
        edge.add(TechnologyLevel.EDGE_PROPERTY, level.get());
    }
}
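The service relies on Edges and TechnologyLevel, which are not listed in this article. The following is a hypothetical reconstruction based only on how they are used above: the edge labels and the "level" property name are assumptions, and the three level values are taken from the test calls later in the article.

```java
import java.util.Locale;
import java.util.function.Supplier;

// Hypothetical reconstruction of the types the service uses; the real ones are not shown.
final class GraphVocabularySketch {

    // Edge labels; the literal values are assumptions.
    static final class Edges {
        static final String WORKS = "WORKS";
        static final String LIVES = "LIVES";

        private Edges() {
        }
    }

    enum TechnologyLevel implements Supplier<String> {
        BEGINNER, INTERMEDIATE, ADVANCED;

        // Assumed name of the edge property that stores the level.
        static final String EDGE_PROPERTY = "level";

        // Value stored on the edge, e.g. "advanced".
        @Override
        public String get() {
            return name().toLowerCase(Locale.US);
        }

        // Parses a path segment such as "advanced"; unknown values raise IllegalArgumentException.
        static TechnologyLevel parse(String level) {
            return valueOf(level.trim().toUpperCase(Locale.US));
        }
    }

    public static void main(String[] args) {
        TechnologyLevel level = TechnologyLevel.parse("advanced");
        System.out.println(Edges.WORKS + " " + TechnologyLevel.EDGE_PROPERTY + "=" + level.get());
    }
}
```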

Resource

As the last step in the application, expose this service as a REST API.

@ApplicationScoped
@Path("cities")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
@Transactional
public class CityResource {


    @Inject
    @Database(GRAPH)
    private CityRepository cityRepository;


    @POST
    public void insert(@Name String name) {

        cityRepository.findByName(name).ifPresent(b -> {
            throw new WebApplicationException("A city with this name already exists", Response.Status.BAD_REQUEST);
        });

        cityRepository.save(new City(name));
    }

    @GET
    @Path("{name}")
    public CityDTO get(@PathParam("name") String name) {
        City city = cityRepository.findByName(name)
                .orElseThrow(() -> new WebApplicationException("city not found", Response.Status.NOT_FOUND));

        return new CityDTO(city);
    }


    @DELETE
    @Path("{name}")
    public void delete(@PathParam("name") @Name String name) {
        cityRepository.deleteByName(name);
    }
}

@ApplicationScoped
@Path("technologies")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
@Transactional
public class TechnologyResource {


    @Inject
    @Database(GRAPH)
    private TechnologyRepository technologyRepository;


    @POST
    public void insert(@Name String name) {

        technologyRepository.findByName(name).ifPresent(b -> {
            throw new WebApplicationException("A technology with this name already exists", Response.Status.BAD_REQUEST);
        });

        technologyRepository.save(new Technology(name));
    }

    @GET
    @Path("{name}")
    public TechnologyDTO get(@PathParam("name") String name) {
        Technology technology = technologyRepository.findByName(name)
                .orElseThrow(() -> new WebApplicationException("technology not found", Response.Status.NOT_FOUND));

        return new TechnologyDTO(technology);
    }


    @DELETE
    @Path("{name}")
    public void delete(@PathParam("name") @Name String name) {
        technologyRepository.deleteByName(name);
    }
}

@ApplicationScoped
@Path("buddies")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
@Transactional
public class BuddyResource {


    @Inject
    @Database(GRAPH)
    private BuddyRepository buddyRepository;

    @Inject
    @Database(GRAPH)
    private CityRepository cityRepository;

    @Inject
    @Database(GRAPH)
    private TechnologyRepository technologyRepository;

    @Inject
    private BuddyService service;


    @POST
    public void insert(@Valid BuddyDTO buddy) {

        buddyRepository.findByName(buddy.getName()).ifPresent(b -> {
            throw new WebApplicationException("A buddy with this name already exists", Response.Status.BAD_REQUEST);
        });

        buddyRepository.save(buddy.toEnity());
    }

    @GET
    @Path("{buddy}")
    public BuddyDTO get(@PathParam("buddy") @Name String buddyName) {
        Buddy buddy = buddyRepository.findByName(buddyName)
                .orElseThrow(() -> new WebApplicationException("buddy not found", Response.Status.NOT_FOUND));
        return BuddyDTO.of(buddy);
    }

    @GET
    @Path("cities/{city}")
    public List<BuddyDTO> getCities(@PathParam("city") @Name String city) {
        return service.findByCity(city).stream().map(BuddyDTO::of).collect(toList());
    }

    @GET
    @Path("technologies/{technology}")
    public List<BuddyDTO> getTechnologies(@PathParam("technology") @Name String technology) {
        return service.findByTechnology(technology).stream().map(BuddyDTO::of).collect(toList());
    }

    @GET
    @Path("technologies/{technology}/{level}")
    public List<BuddyDTO> getTechnologiesLevel(@PathParam("technology") @Name String technology,
                                               @PathParam("level") String level) {

        return service.findByTechnology(technology, TechnologyLevel.parse(level)).stream().map(BuddyDTO::of).collect(toList());
    }

    @GET
    @Path("cities/{city}/technologies/{technology}")
    public List<BuddyDTO> getCitiesTechnologies(@PathParam("city") @Name String city,
                                                @PathParam("technology") @Name String technology) {

        return service.findByTechnologyAndCity(technology, city).stream().map(BuddyDTO::of).collect(toList());
    }

    @PUT
    @Path("{buddy}")
    public void update(@PathParam("buddy") @Name String buddyName, @Valid BuddyDTO dto) {
        Buddy buddy = buddyRepository.findByName(buddyName)
                .orElseThrow(() -> new WebApplicationException("buddy not found", Response.Status.NOT_FOUND));

        buddy.setSalary(dto.getSalary());
        buddyRepository.save(buddy);
    }

    @DELETE
    @Path("{buddy}")
    public void delete(@PathParam("buddy") @Name String buddyName) {
        buddyRepository.deleteByName(buddyName);
    }


    @PUT
    @Path("{buddy}/lives/{city}")
    public void lives(@PathParam("buddy") @Name String buddyName, @PathParam("city") @Name String cityName) {

        Buddy buddy = buddyRepository.findByName(buddyName)
                .orElseThrow(() -> new WebApplicationException("buddy not found", Response.Status.NOT_FOUND));

        City city = cityRepository.findByName(cityName)
                .orElseThrow(() -> new WebApplicationException("city not found", Response.Status.NOT_FOUND));


        service.live(buddy, city);
    }

    @PUT
    @Path("{buddy}/works/{technology}")
    public void works(@PathParam("buddy") @Name String buddyName, @PathParam("technology") @Name String technologyName) {

        Buddy buddy = buddyRepository.findByName(buddyName)
                .orElseThrow(() -> new WebApplicationException("buddy not found", Response.Status.NOT_FOUND));

        Technology technology = technologyRepository.findByName(technologyName)
                .orElseThrow(() -> new WebApplicationException("technology not found", Response.Status.NOT_FOUND));

        service.work(buddy, technology);
    }

    @PUT
    @Path("{buddy}/works/{technology}/{level}")
    public void worksLevel(@PathParam("buddy") @Name String buddyName,
                           @PathParam("technology") @Name String technologyName,
                           @PathParam("level") String level) {

        Buddy buddy = buddyRepository.findByName(buddyName)
                .orElseThrow(() -> new WebApplicationException("buddy not found", Response.Status.NOT_FOUND));

        Technology technology = technologyRepository.findByName(technologyName)
                .orElseThrow(() -> new WebApplicationException("technology not found", Response.Status.NOT_FOUND));


        service.work(buddy, technology, TechnologyLevel.parse(level));
    }
}

The @Transactional annotation is bound to a CDI interceptor that makes every resource operation transactional.
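An interceptor of this kind typically wraps each call: it begins a transaction, commits when the method returns normally, and rolls back when it throws. The plain-Java sketch below shows only that control flow; FakeTransaction and around are hypothetical names, and the actual interceptor used by the application may behave differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Plain-Java illustration of what a transactional interceptor does; no CDI or Neo4j involved.
final class TransactionalSketch {

    // Stand-in for a graph transaction; it only records what happened.
    static final class FakeTransaction {
        final List<String> events = new ArrayList<>();

        void begin() {
            events.add("begin");
        }

        void commit() {
            events.add("commit");
        }

        void rollback() {
            events.add("rollback");
        }
    }

    // Wraps the call: commit when it returns normally, roll back when it throws.
    static <T> T around(FakeTransaction tx, Callable<T> call) throws Exception {
        tx.begin();
        try {
            T result = call.call();
            tx.commit();
            return result;
        } catch (Exception e) {
            tx.rollback();
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        FakeTransaction ok = new FakeTransaction();
        around(ok, () -> "saved");
        System.out.println(ok.events); // [begin, commit]

        FakeTransaction failed = new FakeTransaction();
        Callable<String> failing = () -> {
            throw new IllegalStateException("duplicate");
        };
        try {
            around(failed, failing);
        } catch (IllegalStateException expected) {
            System.out.println(failed.events); // [begin, rollback]
        }
    }
}
```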

Time to Test

With the code ready and the system running on a Java EE 8 server, the next step is to run and test it.

#cities
curl -H "Content-Type: application/json" -X POST -d 'Santos' http://localhost:8080/careerbuddy/resource/cities/
curl -H "Content-Type: application/json" -X POST -d 'Salvador' http://localhost:8080/careerbuddy/resource/cities/
curl -H "Content-Type: application/json" -X POST -d 'Belo Horizonte' http://localhost:8080/careerbuddy/resource/cities/
curl -H "Content-Type: application/json" -X POST -d 'Rio de Janeiro' http://localhost:8080/careerbuddy/resource/cities/
curl -H "Content-Type: application/json" -X POST -d 'Curitiba' http://localhost:8080/careerbuddy/resource/cities/

#technologies
curl -H "Content-Type: application/json" -X POST -d 'Java' http://localhost:8080/careerbuddy/resource/technologies/
curl -H "Content-Type: application/json" -X POST -d 'NoSQL' http://localhost:8080/careerbuddy/resource/technologies/
curl -H "Content-Type: application/json" -X POST -d 'Cloud' http://localhost:8080/careerbuddy/resource/technologies/
curl -H "Content-Type: application/json" -X POST -d 'Container' http://localhost:8080/careerbuddy/resource/technologies/
curl -H "Content-Type: application/json" -X POST -d 'Golang' http://localhost:8080/careerbuddy/resource/technologies/

#buddies

curl -H "Content-Type: application/json" -X POST -d '{"name":"Jose","salary":3000.0}' http://localhost:8080/careerbuddy/resource/buddies/
curl -H "Content-Type: application/json" -X POST -d '{"name":"Mario","salary":5000.0}' http://localhost:8080/careerbuddy/resource/buddies/
curl -H "Content-Type: application/json" -X POST -d '{"name":"Joao","salary":9000.0}' http://localhost:8080/careerbuddy/resource/buddies/
curl -H "Content-Type: application/json" -X POST -d '{"name":"Pedro","salary":14000.0}' http://localhost:8080/careerbuddy/resource/buddies/

#lives

curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/mario/lives/salvador
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/joao/lives/curitiba
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/pedro/lives/santos
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/jose/lives/santos

#works

curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/jose/works/java/advanced
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/jose/works/nosql/beginner
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/jose/works/cloud/intermediate
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/jose/works/container/advanced

curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/mario/works/golang/advanced
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/mario/works/nosql/advanced
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/mario/works/cloud/beginner
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/mario/works/container/beginner

curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/joao/works/java/intermediate
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/joao/works/cloud/advanced
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/joao/works/container/advanced
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/joao/works/golang/beginner

curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/pedro/works/golang/beginner
curl -H "Content-Type: application/json" -X PUT http://localhost:8080/careerbuddy/resource/buddies/pedro/works/container/advanced

Graph Result

Query the results:

curl http://localhost:8080/careerbuddy/resource/buddies/technologies/java
curl http://localhost:8080/careerbuddy/resource/buddies/technologies/cloud
curl http://localhost:8080/careerbuddy/resource/buddies/technologies/java/advanced
curl http://localhost:8080/careerbuddy/resource/buddies/cities/salvador
curl http://localhost:8080/careerbuddy/resource/buddies/cities/santos/technologies/java
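As a cross-check, the sketch below replays the sample data in memory and runs the same five queries. It mirrors only which buddies each query should match, sorted by name; it does not call the REST API or Neo4j, and the actual responses are JSON documents whose exact shape depends on BuddyDTO. SampleDataSketch and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// In-memory replay of the sample data; it mirrors the queries, not the REST or Neo4j code.
final class SampleDataSketch {

    // {buddy, technology, level}, copied from the "works" calls.
    static final String[][] WORKS = {
        {"jose", "java", "advanced"}, {"jose", "nosql", "beginner"},
        {"jose", "cloud", "intermediate"}, {"jose", "container", "advanced"},
        {"mario", "golang", "advanced"}, {"mario", "nosql", "advanced"},
        {"mario", "cloud", "beginner"}, {"mario", "container", "beginner"},
        {"joao", "java", "intermediate"}, {"joao", "cloud", "advanced"},
        {"joao", "container", "advanced"}, {"joao", "golang", "beginner"},
        {"pedro", "golang", "beginner"}, {"pedro", "container", "advanced"}
    };

    // {buddy, city}, copied from the "lives" calls.
    static final String[][] LIVES = {
        {"mario", "salvador"}, {"joao", "curitiba"}, {"pedro", "santos"}, {"jose", "santos"}
    };

    // Buddy names (column 0) of the matching rows, sorted by name.
    private static List<String> names(String[][] rows, Predicate<String[]> filter) {
        return Arrays.stream(rows).filter(filter).map(r -> r[0]).sorted().collect(Collectors.toList());
    }

    static List<String> byTechnology(String technology) {
        return names(WORKS, w -> w[1].equals(technology));
    }

    static List<String> byTechnology(String technology, String level) {
        return names(WORKS, w -> w[1].equals(technology) && w[2].equals(level));
    }

    static List<String> byCity(String city) {
        return names(LIVES, l -> l[1].equals(city));
    }

    static List<String> byTechnologyAndCity(String technology, String city) {
        List<String> inCity = byCity(city);
        return byTechnology(technology).stream().filter(inCity::contains).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(byTechnology("java"));                  // [joao, jose]
        System.out.println(byTechnology("cloud"));                 // [joao, jose, mario]
        System.out.println(byTechnology("java", "advanced"));      // [jose]
        System.out.println(byCity("salvador"));                    // [mario]
        System.out.println(byTechnologyAndCity("java", "santos")); // [jose]
    }
}
```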
