Nowadays, security teams need information and context about data to keep it safe from the red teamers (who can discover attack paths). The data can be protected if we have visibility over it. As the number of services and applications in modern IT environments continues to grow, it has become increasingly difficult for resource providers to manage and locate them all. That’s where service discovery comes in.
Service discovery is used to identify, explore and analyze services of resource providers. In recent years, there has been a growing demand for tools that can help individuals and organizations to discover and visualize services and resources in a graphical context. One tool that can be particularly useful for this purpose is cartography, an OpenSource Python tool that helps to enable a broad set of visual exploration of the services.
This blog details the use of Cartography tool for service discovery, its data security advantages and dependency on the Neo4j database. Cartography is good at exposing the hidden dependencies between the service’s assets that helps to validate assumptions about security risks by understanding those dependencies.
The tool is intended to assist enterprises in understanding, exploring and tracking the relationships between entities in their ecosystem. In contrast to other security tools, cartography is sufficiently general and expandable to assist anyone and can provide better understanding of their exposure to risk, regardless of the platforms they employ.
What is Cartography?
Cartography consolidates infrastructure assets and their relationships in an intuitive graph view powered by a Neo4j database. It can work on both Linux and Windows servers, but it has been tested on Linux servers as of now.
What is Neo4j?
Neo4j is a popular graph database, and it’s written in Java language. It provides Cypher Query language (CQL), which is used to represent graphs visually. It stores data and displays it in the form of a graph, and it doesn’t use tables, columns and rows to store data and presents data in a tabular fashion.
Why Graph Database?
The standard databases will focus only on capturing individual data points, whereas the graph database consists of two entries: nodes and relationships. The nodes are the individual data points, and the relationships describe how those data points are related to each other.
Cartography Setup
- Install the Neo4j graph database on your server by downloading Neo4j Community Edition
- Configure the data sources for resource providers like AWS, Azure or GCP services
- Run Cartography using ‘pip install cartography’ to install the code
- Sync your data by issuing the below command
cartography --neo4j-uri bolt://localhost:7687>
Once Cartography and neo4j get installed and synced with the data, we can view the Neo4j web interface at http://localhost:7474.
Cartography Workflow
Cartography syncs the infrastructure assets of the resource provider and pushes the data to Neo4j Database.
There are a few steps for Cartography Sync – Get, Transform, Load, Cleanup
- The GET function will retrieve the necessary data from a resource provider API.
- The TRANSFORM function will help to convert the data to make it easier to ingest into the graph.
- The LOAD function pushes the processed data to Neo4j.
- After adding the nodes and relationships to the graph, if the latest data needs to be resent to the database, then old nodes and relationships which no longer exist will be deleted as part of the CLEANUP function.
Sample Neo4j graph database
Source: https://github.com/lyft/cartography/blob/master/docs/root/images/accountsandrds.png
Supported platforms and Services
- Amazon Web Services – API Gateway, EC2, Config, Elasticsearch, Elastic Kubernetes Service (EKS), IAM, Inspector, Lambda, KMS, RDS, Route53, Redshift, DynamoDB and S3
- Microsoft Azure – CosmosDB, Virtual Machine, Storage, SQL
- Google Cloud Platform – Cloud Resource Manager, Compute, Google Kubernetes Engine, Storage and DNS
- Google Workspace – Groups, users
- Oracle Cloud Infrastructure – IAM
- Okta – users, groups, roles, applications and organizations
- Kubernetes – Service, Pod, Cluster, Namespace, Container
Benefits of Cartography
- Cost optimization of the services – Post the Cartography sync, one can review the Neo4j services and handle the issue of unused services by deleting those that are no longer needed, helps saving money on those services' costs.
- Improved security – Data transparency reveals vulnerability, allowing for the regaining of control.
- Dependency identification – Provides dependency identification of the services and their properties that can be seen in the database, so cloud providers can better understand the potential impact of service disruptions on other services
- Consolidation of all services – All the services from the cloud providers can be seen in a single place, so cloud users can more easily monitor their resources. This can save the time and effort compared to managing the multiple services in different places.
- Visualization of data in multiple forms – All the node data can be seen not only through graphs but also in different forms like table, text, and code. Tables can help to provide a structured overview of data, while text can help to provide context and explanations. Code can help to make data analysis more transparent and reproducible. By using data in multiple forms, it is possible to provide a more complete and full understanding of data insights.
- Export the data – The node data and its relationships can be downloaded in different forms like JSON (Javascript Object Notation), CSV (Comma Separated Values), PNG (Portable Network Graphics) and SVG (Scalable Vector Graphics). With data in different formats, users can choose the format that best suits their needs and the work with data in a way that is most effective for them. JSON and CSV are good choices for developers and data analysts, while PNG and SVG are better for sharing the data with others and visualizing it.