Linux Definitions, Concepts and Key Ideas
(adapted from various books and from white papers on various web sites)
What is an operating system? What is a kernel?
An operating system is the software that sits between the computer hardware and the application software. At its center is the kernel, which controls processes, manages memory, and mediates communication between software and hardware. Besides the kernel, an operating system also provides other basic services such as file systems, device drivers, user interfaces and system services.
What is UNIX?
UNIX was born in 1969 at Bell Labs, and grew and evolved through the years. In the early 1980s, AT&T began to market UNIX, and also distributed it free to universities. UNIX came into wider and wider use throughout the world. As multiple ports of UNIX started appearing, AT&T standardized what the different ports had to be able to do to still be called UNIX. To that end, compliance with the Portable Operating System Interface for UNIX (POSIX) and the AT&T UNIX System V Interface Definition (SVID) defined whether an operating system was UNIX or not. Major operating systems that meet these standards, and thus are UNIX, include Solaris, AIX, and HP-UX. Linux follows POSIX closely but is generally described as Unix-like rather than as UNIX itself.
What is Linux?
Linux is a free Unix-type operating system originally created by Linus Torvalds in August of 1991 when he was a student at the University of Helsinki. He wrote it partly from scratch and partly by using publicly available software. He then released it to the internet and asked others to work with it, fix it and enhance it. Developed under the GNU General Public License, the source code for Linux is freely available to everyone. Although much of the code for Linux started from scratch, the blueprint for what the code would do was created to follow POSIX standards.
What is open source?
In general, ‘open source’ refers to any program whose source code is made freely available for use or modification. Open source software is usually developed as a public collaboration project and is made freely available, as are patches and fixes.
The Open Source community is a loose collection of highly capable and dedicated individuals who donate their time and energy to developing and maintaining publicly available software. The major pieces of software produced and supported by this community are of very high quality. Some well-known examples of open source software are Linux, Sendmail, Apache and BIND (a widely used DNS server).
What is a distribution? What is a boxed version?
When Linus Torvalds first developed Linux, the operating system basically consisted of his kernel and some GNU tools. With the help of other developers, Linus added more and more tools and applications.
With time, individuals, university students and companies began distributing Linux with their own choice of packages bound around Linus' kernel. This is where the concept of the "distribution" was born. The object of a distribution is to make the hundreds of unrelated software packages that make up Linux work together as a cohesive whole.
A distribution is sometimes also called a ‘boxed version’. Since Linux itself is free, what you buy is the CD, the collected set of packages, printed documentation and usually some type of support (a phone number to call to help with installation, for example).
You can buy a boxed version of Linux from companies such as Red Hat, SuSE, Caldera, MandrakeSoft and many others. You can also download Linux for free from any number of companies and individuals. There are distributions of all types and for practically any kind of computing endeavor.
What is Clustering?
Clustering is most widely understood as combining multiple systems so that they provide services a single system could not. Clustering is used to achieve higher availability, scalability and easier management. Higher availability can be achieved with "failover" clusters, in which resources automatically move between two or more nodes in the event of a failure. Scalability can be achieved by balancing the load of an application across several computer systems. Simpler management can be attained by administering virtual servers rather than each individual computer system.
What are High Availability Clusters?
High availability clustering joins together two or more servers to guard against service interruptions, including planned shutdowns (e.g., maintenance, backups) and unplanned outages (e.g., system failure, software failure, operator errors). The group of connected systems is known as a cluster.
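The failover behavior described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular cluster product works: the Node and FailoverCluster classes, the node names, and the "least-loaded survivor" takeover policy are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True
    resources: set = field(default_factory=set)


class FailoverCluster:
    """Toy failover cluster: each resource runs on exactly one healthy node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def owner_of(self, resource):
        """Return the name of the node currently holding the resource."""
        for node in self.nodes:
            if resource in node.resources:
                return node.name
        return None

    def fail(self, name):
        """Mark a node as failed and move its resources to a healthy node."""
        failed = next(n for n in self.nodes if n.name == name)
        failed.healthy = False
        survivors = [n for n in self.nodes if n.healthy]
        if not survivors:
            raise RuntimeError("no healthy node left to take over")
        # Takeover policy (an assumption of this sketch): least-loaded survivor.
        target = min(survivors, key=lambda n: len(n.resources))
        target.resources |= failed.resources
        failed.resources = set()
        return target.name


cluster = FailoverCluster([Node("node-a", resources={"database"}), Node("node-b")])
new_owner = cluster.fail("node-a")
print(new_owner, cluster.owner_of("database"))
```

In this example the "database" resource starts on node-a and ends up on node-b once node-a is marked as failed. A real cluster would also need a way to detect the failure (for example a heartbeat), which the sketch leaves out.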
What is a Beowulf or High Performance Computing Cluster?
A Beowulf or High Performance Computing Cluster (HPCC) combines multiple Symmetric Multi-Processor (SMP) computer systems with high-speed interconnects to achieve the raw computing power of classic "big-iron" supercomputers. The nodes of such a cluster work in tandem to complete a single request: the work is divided among the server nodes, and the results are reassembled and presented to the client as if a single system had done the work.
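The divide, compute and reassemble pattern can be illustrated with a small Python sketch. It is only an analogy: worker threads stand in for cluster nodes, and the function names (split_work, node_task, run_on_cluster) and the sum-of-squares workload are hypothetical. Real clusters of this kind usually rely on a message-passing library across machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, node_count):
    """Divide items into node_count contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), node_count)
    chunks, start = [], 0
    for i in range(node_count):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def node_task(chunk):
    """Work done by one node: here, a partial sum of squares."""
    return sum(x * x for x in chunk)


def run_on_cluster(items, node_count=4):
    """Scatter the chunks to the workers, then combine the partial results."""
    chunks = split_work(items, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(node_task, chunks))
    return sum(partials)


total = run_on_cluster(list(range(1, 101)))
print(total)
```

The caller sees one result from run_on_cluster and does not need to know how many workers produced it, which mirrors the "as if a single system had done the work" property.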
What is IP Load Balancing?
"Load balancing" refers to server farms that distribute requests for the same application among multiple independent servers. The term applies to clusters in which some number of nodes process requests for the same type of application, often web servers, streaming media servers, terminal servers, or read-only FTP and file servers. Load balancing can be performed by software-based mechanisms as well as appliance-based mechanisms.
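As a minimal sketch of a software-based mechanism, the Python below hands out requests to servers in round-robin order. Round-robin is only one of several possible policies and is an assumption of this example, as are the RoundRobinBalancer class and the server names.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Assign each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def route(self, request):
        """Return the (server, request) pair chosen for this request."""
        server = next(self._servers)
        return server, request


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
assigned = [balancer.route(f"GET /page{i}")[0] for i in range(5)]
print(assigned)
```

Five requests land on web1, web2, web3, web1 and web2 in turn, so no single server handles all of the traffic.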
What are Scalable Clusters?
Scalable clusters allow compute nodes to be added to a cluster in order to increase its combined processing resources. Adding nodes can increase computing power because processors within a cluster can exchange data efficiently, which can also reduce the average memory access time. This is particularly attractive when running parallel applications.
What is Oracle Real Application Clusters (RAC)?
Oracle RAC is a parallel database clustering technology from Oracle that runs under Linux and is an option of Oracle9i Enterprise Edition. RAC offers IT organizations two main advantages, both technical and business:
Scalability: RAC is an active-active cluster with shared storage, in which multiple servers can work in parallel on the same set of data. Through a cluster interconnect technology called Cache Fusion, up to 8 database server nodes are connected over a private Gigabit interconnect and share database storage using SCSI or Fibre Channel technology. This allows the customer to add nodes as growth requirements increase. The maximum configuration supported by Dell is eight PowerEdge™ 8450 servers (8 CPUs and 16GB of memory per server), which combined provide the capacity of a 64-CPU database system and up to 128GB (16GB per server) of shared memory across the server nodes.
Availability: RAC, in addition to providing performance and scaling beyond a single server, delivers fault tolerance and helps deliver maximum uptime. Dell, Oracle and EMC have worked together on making the cluster fully redundant so that the database can remain up and running in the event of a component failure in the cluster (server, switch, disk, interconnect, etc).
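The aggregate figures quoted for the maximum eight-server configuration follow from multiplying the per-server resources by the node count. The short Python check below uses only the numbers given in the text; the constant names are made up for this illustration.

```python
NODES = 8                  # servers in the maximum configuration
CPUS_PER_NODE = 8          # CPUs per server
MEMORY_GB_PER_NODE = 16    # memory per server, in GB

total_cpus = NODES * CPUS_PER_NODE
total_memory_gb = NODES * MEMORY_GB_PER_NODE
print(total_cpus, total_memory_gb)
```

This gives 64 CPUs and 128 GB of memory across the cluster.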
What is the Linux Virtual Server?
The Linux Virtual Server is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system. The architecture of the cluster is transparent to end users, who see only a single virtual server. The Linux Virtual Server can be used to build highly scalable and highly available network services, such as a scalable web, mail or media service. The real servers may be interconnected by a high-speed LAN or by a geographically dispersed WAN. In front of the real servers sits a load balancer, which schedules requests to the different servers and makes the parallel services of the cluster appear as a single virtual service on one IP address. Scalability is achieved by transparently adding or removing nodes in the cluster. High availability is provided by detecting node or daemon failures and reconfiguring the system appropriately.
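The idea of one virtual address in front of a changing pool of real servers can be sketched in Python as follows. This is not the Linux Virtual Server code or its interface: the VirtualService class, its method names, the "fewest open connections" scheduling rule, and the example address (taken from a range reserved for documentation) are all assumptions made for illustration.

```python
class VirtualService:
    """One virtual IP in front of a changing pool of real servers."""

    def __init__(self, virtual_ip):
        self.virtual_ip = virtual_ip
        self._active = {}  # real server name -> open connection count

    def add_server(self, name):
        """Add a real server to the pool; clients are unaware of the change."""
        self._active.setdefault(name, 0)

    def remove_server(self, name):
        """Take a real server out of the pool."""
        self._active.pop(name, None)

    def health_check(self, is_alive):
        """Drop every real server for which is_alive(name) is false."""
        for name in list(self._active):
            if not is_alive(name):
                self.remove_server(name)

    def schedule(self):
        """Pick the real server with the fewest open connections.

        Connections are never released in this sketch, so counts only grow.
        """
        if not self._active:
            raise RuntimeError("no real servers available")
        name = min(self._active, key=self._active.get)
        self._active[name] += 1
        return name


service = VirtualService("192.0.2.10")
for server_name in ("real1", "real2"):
    service.add_server(server_name)

first = [service.schedule() for _ in range(2)]
service.health_check(lambda name: name != "real1")  # simulate real1 failing
after_failure = [service.schedule() for _ in range(2)]
service.add_server("real3")                         # scale out
after_growth = service.schedule()
print(first, after_failure, after_growth)
```

Clients keep using the same virtual address throughout. Requests go to real1 and real2 at first, only to real2 after real1 fails its check, and to the newly added real3 once it joins the pool.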