November 10, 2023

Basic Concepts of Computer Networks for Application/Server Admins

By Joel Silva

Something quite common in the routine of those who manage applications on Servers, such as the Alteryx Server, Tableau Server, Power BI Server, or even Snowflake Data Cloud, is solving problems that users occasionally face or at least carrying out troubleshooting.

In this blog, we will cover basic concepts of computer networks for server administrators. The goal is to demonstrate how machines and services behave on the network, how communication occurs, and techniques to discover whether the problem faced is a network problem or an application problem.

Throughout this blog, several analogies will be used to make the more complex content easier to understand. The main objective here is to provide even more knowledge for experienced users and educate new users through a pleasant and simplified read.

Here are some of the problems frequently faced by server application administrators:

Users unable to access the application’s website
Users are unable to synchronize the local application with the server
When deploying a new application server, the URL does not respond

How do Computer Networks Work?

Computer networks are made up of interconnected devices that allow the exchange of information and resources. In a very simplified way, we can classify these devices into two categories: intermediate devices (switches, routers) and end devices – technically known as nodes (servers, computers, laptops, smartphones, etc.).

These interconnected devices send, receive, and exchange data, voice, and video traffic, thanks to the hardware and software that make up the environment.

Since we are talking about networks, you’re probably wondering: What is the Internet?

The Internet is a network of global connections that allows instant data sharing between devices. In other words, the Internet is nothing more than the connection between several global network providers, from the neighborhood provider to the highest provider level in a country or continent.

An analogy that we can use to exemplify this interconnection of networks is the road network. If you want to travel by car from New York to Nashville, you will travel along local streets, avenues, major highways, and roads until you reach your destination.

This navigation/movement is only possible because all stages of the route are connected at some point somehow.

On the Internet, it works exactly the same way. If you are in Brazil and want to access a website that is located in Japan, your communication will work perfectly because this “path” already exists in the global network (Internet).

In road traffic, we need to follow a series of laws and protocols (traffic laws, car license plates, licensing, etc.). Internet traffic is no different. In this case, these are the main elements:

IP Addresses
Protocols
Ports
Routers

The IP address is the identifier that allows information to be sent between devices on a network: it contains location information and makes the device accessible for communication. This element is essential: without it, a device cannot communicate on a network.

We can define network protocols as the set of rules that establish a common communication language on the network. Let’s look at another analogy to make it easier to understand what network protocols are.

Suppose a Brazilian and a Polish person want to talk in person. They will first need to agree on the language for that conversation, since Brazilian Portuguese and Polish are very different languages.

This means that if each person wants to communicate in their native language, they probably won’t be able to understand each other. Hence the need for a common language, such as English, for the conversation to happen. In simple terms, in our analogy, the English language would be equivalent to the network protocol. On the Internet, this universal protocol/set of rules for communication is known as TCP/IP.

A port is a virtual point where network connections begin and end. Ports are software-based and managed by the computer’s operating system. Each port is associated with a specific process or service. Ports allow computers to differentiate between different types of traffic easily: requests to access web pages go to a different port than FTP file transfers, for example.

Ports are standardized across all devices connected to the network, each identified by a specific number. Several ports are reserved for specific protocols: HTTP requests, for example, are sent to port 80 by default, while websites secured with a digital certificate (HTTPS, using SSL/TLS) use port 443.

Port numbering is widely used in creating and maintaining firewall rules, for example. This is something we will quickly cover later on.
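
To make the idea of ports more concrete, here is a minimal Python sketch that checks whether something is listening on a given TCP port. The function name port_is_open and the default timeout are assumptions chosen for illustration; they do not come from any specific tool.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, port_is_open("my-server.example.com", 443) (a hypothetical host name) would tell you whether the HTTPS port of that server accepts connections from your machine. A False result does not say why the connection failed, only that it did.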

Routers, in turn, are intermediate devices that connect two or more networks. Among their functions is the ability to define the best route to the destination of each request (which, from now on, we will call a packet) and manage traffic between networks. Collectively, millions of interconnected routers are the key element for the existence of the Internet as we know it.

Switches and routers are similar, but we can differentiate them in simple terms: switches forward packets between nodes on the same network, whereas routers forward packets between different networks.

Now that we have a brief but good foundation, we can move on to services and protocols.

Main Network Protocols and Services

There are many standardized network protocols and services, but let’s keep it simple and look at how five of them work:

HTTP: Hypertext Transfer Protocol
DNS: Domain Name System
DHCP: Dynamic Host Configuration Protocol
FTP: File Transfer Protocol
NTP: Network Time Protocol

Studying/working with computer networks is definitely an adventure of acronyms and alphabet soup!

HTTP is a protocol that allows you to obtain resources, such as HTML documents. It is the basis of any data exchange on the Web and a client-server protocol (we will discuss this architecture later in this blog). This means that requests are initiated by the client that will receive the resource, usually a Web browser such as Google Chrome.

A complete document is reconstructed from the different sub-documents obtained, such as text, layout description, images, videos, scripts, and much more.

This system underpins communication across the Web: websites and content connected through hyperlinks can be reached by the public with a mouse click or a tap on the screen.

Currently, the secure version of HTTP, called HTTPS, is the most used. It is an evolution of HTTP: the difference is that HTTPS encrypts the communication between devices using an SSL/TLS certificate.

DNS is one of the most important protocols, in my humble opinion. This is because, for our human brain, it is much easier to remember names than numbers. For example, if I want to talk to my friend Marcus by phone, I just need to open my phone book, search for his name, and make the call.

Memorizing his phone number alone would be a relatively easy task, but imagine memorizing the phone numbers of all your family and friends. For most people, that would be nearly impossible. Some people find it very easy to memorize numbers, but they are exceptions.

As we learned in the previous topic, all devices need to have an IP address so they can communicate on the network. That said, when we want to access a web page, such as the phData website, we request an HTTP connection to the IP address of the web server that hosts the phData website.

Imagine how much work it would take to remember all the IP addresses of the websites you normally visit. That’s where our DNS hero appears. In simplistic and practical terms, it acts similarly to the phonebook in our analogy, resolving names to IP addresses and vice versa.

Therefore, when trying to access a web page, you are initiating an HTTP request, which needs the IP address of the destination website to be initiated. This means that in this access, the DNS acts first in a matter of milliseconds to translate the desired name to the IP address. Once it knows the IP address, this information is passed to HTTP, which then connects to the website and displays the page for you.

The importance of DHCP is directly linked to the ease of configuring network parameters when connecting to a network, whether new (first time) or already known and connected previously.

When connecting to a network, you need some basic settings to navigate, such as your IP address, the network mask, the router’s address to forward packets to other networks, and the DNS server, among other settings. This is where DHCP acts, giving you all these network parameters as soon as your device connects to the network.
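
As a rough illustration of how these parameters work together, the Python sketch below uses an IP address, a network mask, and a router (gateway) address to decide where a packet is sent first. The function name next_hop and all addresses in the usage example are hypothetical values, not settings from a real environment.

```python
import ipaddress


def next_hop(own_ip: str, netmask: str, gateway: str, destination: str) -> str:
    """Return where a packet is sent first: the destination itself or the gateway."""
    # Combine the address and mask into the local network, e.g. 192.168.1.0/24.
    local_net = ipaddress.ip_interface(f"{own_ip}/{netmask}").network
    if ipaddress.ip_address(destination) in local_net:
        return destination  # same network: delivered directly
    return gateway  # different network: handed to the router
```

With own_ip "192.168.1.25", netmask "255.255.255.0", and gateway "192.168.1.1", a packet for 192.168.1.80 stays on the local network, while a packet for 8.8.8.8 is handed to the router at 192.168.1.1.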

Very briefly, the FTP protocol allows authorized users to download files from and upload files to an FTP server, a computer that stores the data. In this way, FTP facilitates the transfer of information between different devices.

NTP is the protocol responsible for synchronizing the clocks of devices on a network, such as servers, workstations, routers, and other equipment, based on reliable time references.

We are making good progress! Let’s now understand how the client-server architecture works.

Client-Server Architecture

The client-server architecture is a distributed application architecture. That is, in the network, there are providers of resources or services to the network, which are called servers, and there are those requesting the resources or services, called clients.

The client does not share any of its resources with the server, but it requests some functions from the server. The client is responsible for initiating communication, while the server waits for incoming requests.

This architecture works analogously to the service of a restaurant. Customers enter the establishment and request food service from a waiter. The request comes from the customer, and the restaurant delivers what was requested.
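
The restaurant analogy can be sketched in a few lines of Python using plain TCP sockets. This is only an illustrative toy, assuming a single client and a single request; the function names serve_once, order, and demo are made up for this example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: wait for one client, read its request, send a reply."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024).decode()
        conn.sendall(f"Here is your {request}".encode())


def order(host: str, port: int, dish: str) -> str:
    """Client side: initiate the connection, send a request, read the reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(dish.encode())
        return client.recv(1024).decode()


def demo() -> str:
    # Port 0 asks the operating system for any free port on the loopback address.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]
        waiter = threading.Thread(target=serve_once, args=(listener,))
        waiter.start()
        reply = order("127.0.0.1", port, "pizza")
        waiter.join()
        return reply
```

Note that the server only listens and answers, while the client is the one that opens the connection, which mirrors the roles described above.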

What Happens Behind the Scenes?

For now, let’s take a break from theory and see these concepts in practice. Cisco provides a network simulator called Cisco Packet Tracer. I created a simple scenario that contains an Internal network and a connection to the Internet and activated simulation mode to demonstrate some of the situations described above.

With this practical demonstration (although in the simulator), we are now able to understand more about the OSI Model, which is the standard teaching model that illustrates the communication between computer systems and networks.

When typing the URL of a web page and trying to access it in the browser, we create a request that starts in the browser and goes down to the last layer, where the data will be transmitted to reach its destination. We can call the act of creating a request (normally made by the client node) Data Encapsulation.

When the packet arrives at the destination server, it is necessary to perform Data Decapsulation so that the application/service can understand, process, and respond to the request made by the client.

We can compare the encapsulation process to sending a package by postal service. If I want to send a baseball to my friend Pedro, I must go to a post office. There, the ball will be packed in a box (which we can compare to the packet), and all shipping information is added so that the origin and destination of the package can be identified.

This is exactly what happens in the lower layers of the OSI model: headers containing protocol information, application port, IP address, and physical address, among other information, are added so that the packet does not get lost on the network and arrives effectively and entirely at its destination.

It is from these headers that routers determine the best route for each packet.
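
A simplified way to picture encapsulation and decapsulation is as nested wrappers, one per layer. The Python sketch below is a teaching model only: real headers carry many more fields, and the class names, ports, and addresses used here are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:  # transport layer: adds the ports
    src_port: int
    dst_port: int
    payload: bytes


@dataclass
class Packet:  # network layer: adds the IP addresses
    src_ip: str
    dst_ip: str
    payload: Segment


@dataclass
class Frame:  # data link layer: adds the physical (MAC) addresses
    src_mac: str
    dst_mac: str
    payload: Packet


def encapsulate(data: bytes) -> Frame:
    """Wrap application data in one header per layer, top to bottom."""
    segment = Segment(src_port=50000, dst_port=443, payload=data)
    packet = Packet(src_ip="192.168.1.25", dst_ip="203.0.113.10", payload=segment)
    return Frame(src_mac="aa:aa:aa:aa:aa:aa", dst_mac="bb:bb:bb:bb:bb:bb", payload=packet)


def decapsulate(frame: Frame) -> bytes:
    """Strip the headers in reverse order to recover the application data."""
    return frame.payload.payload.payload
```

In this model, a router would only need to look at the dst_ip field of the Packet to choose a route, without ever opening the application data inside.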

What is a Firewall?

The last topic in this great theoretical foundation about computer networks is a question that all of us have asked at some point in our lives: after all, what is a firewall?

Cisco has the following definition for firewall, which I particularly like because it is succinct and assertive:

A firewall is a network security device that monitors incoming and outgoing network traffic and decides to allow or block specific traffic according to a defined set of security rules.

Rules are typically created based on the following elements: IP Addresses, Network Masks, Ports, and Protocol.
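
To show how such rules might be evaluated, here is a minimal Python sketch. It assumes that rules are checked in order, that the first match wins, and that traffic matching no rule is blocked; real firewalls differ in syntax and behavior, and the Rule fields and example networks are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    action: str    # "allow" or "block"
    source: str    # network in CIDR notation, e.g. "10.0.0.0/8"
    protocol: str  # "tcp" or "udp"
    port: int


def decide(rules: list[Rule], src_ip: str, protocol: str, port: int) -> str:
    """Return the action of the first matching rule; block if none matches."""
    for rule in rules:
        in_source = ipaddress.ip_address(src_ip) in ipaddress.ip_network(rule.source)
        if in_source and rule.protocol == protocol and rule.port == port:
            return rule.action
    return "block"  # assumed default-deny policy


# Hypothetical rule set: one subnet is blocked, the rest of 10.0.0.0/8 may use HTTPS.
RULES = [
    Rule("block", "10.0.5.0/24", "tcp", 443),
    Rule("allow", "10.0.0.0/8", "tcp", 443),
]
```

With this rule set, a request from 10.1.2.3 to TCP port 443 is allowed, while the same request from 10.0.5.7 is blocked because the more specific rule comes first.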

When the user is trying to access a website that is hosted outside the internal network, such as Snowflake, it is important to check that no firewall/proxy rule is preventing this connection.

Troubleshooting Issues

Now, the most awaited moment: how to troubleshoot some of the most common problems:

Users unable to access the application’s website
Users are unable to synchronize the local application with the server
When deploying a new application server, the URL does not respond

The first problem is one of the most common types of tickets to receive. In most cases, the application website (such as Alteryx Gallery and Tableau Server) is restricted to internal networks only. This means that if the user is working from any network other than the internal office network, the website will not be accessible unless they are connected to the VPN.

These are typically the steps I take to try to identify the problem:

Check if the service is up and accessible from the server itself
Check if the name is resolvable
Remember the DNS concept? A very simple way to test if the DNS is able to resolve a specific name is to open the command prompt and type nslookup followed by the name you want to resolve, for example nslookup phdata.io.

In this case, it’s clear that the searched name phdata.io is resolvable and answers with an IPv4 address. This means DNS is working without any problems.

If there were no response from the DNS, it would be a great indication that the current DNS cannot resolve the name (which takes us back to the VPN issue). Some DNS servers only work internally, which means that to use them, you also need to be connected to the VPN.
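
If you prefer to script this check, the following Python sketch performs a similar test. One assumption to keep in mind: it uses the operating system's resolver (which also consults the local hosts file), so it is not an exact equivalent of nslookup. The function name resolve is made up for this example.

```python
import socket


def resolve(name: str) -> list[str]:
    """Return the IP addresses the system resolver finds for name, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

An empty list from resolve("phdata.io") would point to a name resolution problem, while a list of addresses means the name resolved and the investigation can move on to connectivity.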

Check if there is connectivity.
To test connectivity between a specific machine and a specific address, an excellent way is to use the ping command:

On Windows, this command sends 4 packets to the specified destination by default. Remember that the destination can be an IP address or a name (as we did in the screenshot above). If we use a name, the first step will be to resolve the name (implicit DNS request), and then there will be an attempt to connect to the returned IP address.
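
The same check can be wrapped in a small Python helper that calls the system's ping command. This is a sketch under a few assumptions: the ping executable is available on the PATH, and the exit code is treated as a rough success signal, which is not fully reliable on every platform. The function names are hypothetical.

```python
import platform
import subprocess
from typing import Optional


def ping_command(host: str, count: int = 4, system: Optional[str] = None) -> list[str]:
    """Build the ping command line for the current (or given) operating system."""
    system = system or platform.system()
    # Windows uses -n for the packet count; Linux and macOS use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def ping(host: str, count: int = 4) -> bool:
    """Run ping and report whether it exited successfully."""
    result = subprocess.run(ping_command(host, count), capture_output=True, text=True)
    return result.returncode == 0
```

Building the command separately from running it makes the platform difference explicit: the packet count is passed with -n on Windows and with -c on Linux and macOS.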

Check if you are connected to the correct network.
If neither the name resolution check nor the connectivity check produced a positive response, there is most likely a network problem establishing the connection. Again, make sure you are connected to the correct network, including VPN.

A last command that can also provide relevant information is tracert. This command returns the entire network path taken by the packet until it reaches the destination:

If the packet does not reach its final destination, the output shows how far it got. This information will be useful in understanding which router the packet may be getting lost on.

PS: If you find asterisks in the last column instead of the node name/IP, this router is likely configured not to identify itself on the network. This is sometimes a security practice adopted by some network administrators.

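It is worth remembering that equivalent commands exist on different operating systems: nslookup and ping are available on Windows, Linux, and macOS, while tracert is the Windows name for what Linux and macOS call traceroute.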

Also, remember the firewall concept we demonstrated here! After all, the connectivity problem may be caused by some firewall policy/rule restricting communication at the name/address/port level.

Closing

This blog covers a considerable number of network concepts that are complex and can be studied much more deeply. However, the tip of the iceberg shown here can greatly help to solve everyday problems.

If the problem is outside the application, it is important to demonstrate and highlight to the network team the steps that have already been carried out and which indicate a possible problem in the network. This helps our work as application administrators, helps the team that manages the network, helps the user who is facing the problem, and helps the company’s productivity.

As we all know, time is money, and we can make solving a problem much more efficient by knowing the concepts and techniques demonstrated here. And, of course, a good relationship with everyone involved in the situation makes our work easier.

If you have any additional questions on how to troubleshoot issues on application servers, reach out to our team of experts!
