Pelendur: Steward of the Sysadmin
| Matt Curtin1 |
Sandy Farrar |
Tami King |
| The Ohio State University |
| Department of Computer and Information
Science |
| {cmcurtin,farrar,tami}@cis.ohio-state.edu |
November 27, 2000
This paper also available in PDF and Postscript.
Abstract:
Here we describe Pelendur, a system for the management of
common system operation tasks. Specifically, Pelendur
focuses on the management of user accounts and related information
(such as groups) across platforms and even for particular software
packages (like databases) that require user authentication.
Pelendur has reduced the massive process of deleting
expired accounts and creating new accounts between terms from a
week-long operation by several part-time operators (with subsequent
cleanup by a collabortion of instructors and staff) into a
completely automated process that requires less than 15 minutes of
staff work and completely eliminates the need for instructor
intervention.
1. Introduction
In 1998, our department was in the midst of a massive migration of our
computing facilities wherein we moved from an architecture of many
HP-UX clusters to an architecture using Solaris-based
function-specific servers with thin clients in offices and labs. Some
of the software used in the old environment needed to be
replaced [3]. Because of
severe limitations in the functionality and correctness of the largely
ad-hoc scripts run by operators for the creation of accounts, it was
determined that an account management system capable of managing our
evolving multiplatform environment was needed.
1.1 Yet Another Account Management System?
The idea of implementing a software system for the management of user
accounts is not new; past years' LISA conferences have seen many such
systems. Even if we briefly ignore the issue of availability, some
systems described were unsuitable because of extreme differences in
the way that accounts are created and managed [1,5]
and incompatible means of handling account
data [7,11,9,10,4,8].
In the end, the most compelling reason for us to build our own
software was that a grander vision existed: a single data repository
for our department, which would include such things as data needed for
user accounts, course-specific computing requirements, and access to
various limited-access department resources. We could not find any
available account management system that would work easily with any
sort of database that we would construct.
1.2 The Academic Environment
Account management, though a fairly straightforward task, is quite
demanding in an academic environment. Each term, we receive course
rosters from the university registrar. Although students who major in
computer and information science have ``permanent'' accounts (those
that will remain until after they graduate or change majors), we have
thousands of other accounts that are created specifically for the
duration of the term. We have one week between most terms, which
means that during this time, we need to delete potentially more than
2,500 accounts and then create another 3,000 accounts for the next
term.
In this paper, and in our environment, we use a term that will be
important to understand: ``section''. This is a specific ``class''
that meets together. This term is introduced because many sections of
a given class can be scheduled for the same term and we need to know
the most granular level of grouping available from the registrar.
1.3 System Requirements
Specific requirements for this system were identified. In its full
design, our scope is actually much more broad than the rather specific
task of account management. The reason for this is that parts of
account management (such as determining how much quota to associate
with an account) are dependent on other criteria like the requirements
for courses that the user of that account is scheduled to take. For
example, a course that deals with particularly large data sets might
have a requirement for more than the default amount of disk quota.
Initial requirements focused on the actual management of user accounts
and dependencies.
- Account Management
- This is the management of individual account
profiles. Adding, editing, expiring, and purging them.
- Course Management
- Courses have particular requirements (such as
the need for one platform or another and the use of a software
system like Sybase) that need to be configured and
managed. Some of these configuration options are simple matters of
preference. Others are matters of policy, which the course
coordinator does not have the authority to change. (Though we refer
specifically to courses, there is nothing that prevents these
entities to be managed from being project groups, departments, or
any other sort of group that might have particular requirements in
other environments.)
- Resource Management
- Anything that exists in our environment
(such as ``Unix machines'', ``NT machines'', ``Sybase database'',
``disk quota'', ``print quota'', and ``color printers'') might also
need to be managed. As these are neither accounts nor courses, but
share basic properties, we classify all of these as ``resources''.
The basic point here is that users who are administratively
responsible for these reousrces mangage them through
Pelendur instead of having us perform all of the
management tasks for them.
Additional requirements were slated for future development. Some of
these features are now implemented and others are still on our
``to-do'' list.
- Unix groups
- As we use groups to manage sets of users on the
Unix systems, Pelendur should be aware of groups and know
how to use them. (This interface is now partially implemented; new
groups are created, but old groups are not garbage collected.)
- Mailing lists
- Some of our course instructors and students
prefer to use mailing lists to stay in touch. We presently use both
faculty-maintained aliases files and
Majordomo [2], but are now investigating
the possibility of replacing both of these mechanisms with
Mailman [12].
- Course directories
- Because some courses have group projects or
software that is specific to the course, we need to be able to
associate directories with a given course. Handling of filesystem
permissions should enforce the policy for read and write access
established by the instructor.
- Multidomain management
- Currently, everything that is a part of
the instructional environment is considered ``the system''.
Accounts that belong to individual research labs are not managed by
Pelendur but it could be convenient for us to have that
option available. Should we do this in the future, it would be nice
to have the option of having a single Pelendur
installation be able to manage multiple ``systems'', rather than
having to make a new installation of the software for each domain
that needs to be managed. This feature would also be useful to
include access to limited-access machines, such as those that are
set aside for long-term computationally-intensive jobs.
- Electronic lock systems
- A relatively new addition to our
department is an electronic lock system, whereby ID cards are used
for access control instead of physical keys. Presently, this is
managed by a standalone DOS-based system.
- Course newsgroups
- Most courses in our environment is assigned a
newsgroup on our local news server. This is the default means of
providing an ``out of class'' communication channel. Presently, we
just create new groups as new courses are added and cancel the
messages in the groups at the end of each term by hand. Though this
is not a big time-sink, we would like for these processes to be
automated.
- Course-specific environment
- Some courses have particular
environmental needs, such as a course-specific $PATH
setting, for example. Pelendur can provide this.
- ``Any computing resource''
- As we continue to move forward,
other resources are identified and incorporated into the system's
functionality. At a very high level, the goal of the system is to
manage the systems' configurations so that system administrators can
do other things that computers can't do very well, like planning.
2. Design and Implementation
Pelendur is a large system, made up of several programs.
We'll first describe the philosophies that influenced the system's
design and then consider the programs and major library modules that
these programs use.
2.1 Data-Driven
The entire system sits atop a Sybase relational database.
Rather than creating code that would depend upon very specific data,
working with entities that make sense for our environment, we opted to
put as much of the system in data as possible and to make those data
be as generic and flexible as possible. Thus, rather than dealing
with courses, sections, and instructors, other users of
Pelendur will be able to work with teams, departments, and
project coordinators. Each deployment of Pelendur will
define its own terms and the relationships among them that make sense.
We believe it important to emphasize that this makes the integrity of
the database especially critical. In such a highly dynamic system,
we're not dealing with simple cases of the Wrong Thing failing to
achieve the desired result. Data that have been compromised by a
moderately clever attacker can be used to attack the system itself,
creating accounts for attackers, granting them privileges to the
entire system, and possibly even running commands with superuser
access.
The schema is represented in Figure 1. Here we
describe each of these tables in some detail.
Figure 1:
Schema of Pelendur's Database
|
- People
- contains information about a person. When a person is
added to the table they are assigned a database identifier and a
user login. Those two values will be used to tie a person to their
resources.
- Account
- is a table that contains information about individual
accounts.
- Classification
- contains information on thingies in the system.
A thing can be anything with the exception of a person or an
individual account. Most classifications define a resource or a
membership group. There are also templates and defaults in
Classification that are used to create resources and membership
groups or define certain values in the system (e.g., what the
current quarter is). Each classification is assigned a unique
number (SID) when it is added to the database.
- MembershipIn
- contains the memberships for the membership groups
from Classification. It maps a user login to a SID and is used to
determine what resources that login should have.
- UserRights
- defines owners of classifications and grants access
to users for a classification. This allows a user to manage
resources for their membership groups.
- ResourceUsedBy
- defines what resources a classifications has.
It also indicates if the resource can be edited by the owner or
proxy of the classification.
- ResourceGroup
- links classifications together.
2.2 Flexible
We made an effort to avoid ``hardwiring'' anything in the system.
This was largely accomplished by taking a very dynamic view of the
data. That is, instead of having an ``account'' with a ``disk quota''
field that would be assigned a value based on the state of the system
at the time of the account's creation, all information about an
account must go through a process of resolution, where its
dependencies are determined and the values for each of the account's
properties are resolved at run-time. This view was taken for
everything in the system, not just accounts.
Something else that we incorporated in order to allow maximal
flexibility is a system of property inheritance. Although dealing
with a relational database system at the core, we were able to provide
the ability to inherit properties from a parent by specifying a
relationship between various records in the database by defining a
``parent ID'' (Called ``PID'') as a means of determining which ``SID''
is one step closer to the root than the current.
This gives us the ability to specify object-oriented ``is-a''
relationships between records in the database. Thus, the ``tree'' of
elements to be resolved can be of an arbitrary depth, allowing each
site (and each particular type of resource being managed) to have an
appropriate number of levels to support the sort of abstraction
desired, without forcing some high level of overhead on those who do
not need to deal with such abstractions.
A good example of how we use the ability to inherit properties is in
the case of a series of courses that a student will take in sequence.
There will be some Classfication table entry that will identify the
series. Each course in the series will have its own Classification
entry whose PID identifies the Classification entry of the series as
the parent. Each section in a course will have its own Classification
entry whose PID identifies the Classification entry of the parent
course. Thus, configuration changes are made at the appropriate
level: those that affect all sections in the series will be made in
the Classification entry for the series, those that affect all
sections in a specific course in the series will have those changes
made in the course's Classification entry, and those that are specific
to a particular section will be made in that section's Classification
entry. When we need to determine how much quota, for example, an
account has, we'll determine which sections in which the account has
membership. Those will be resolved by walking up the tree until we
get to the root object (as determined by having a PID of 0),
populating the fields in memory with the values in the database and
returning. By the time the initial Classification entry returns, we
will have queried each level in the tree, populating the object with
the levels specified at the highest level first, and overriding those
with whatever (if anything) was specified in the lower levels.
Additionally, where values are numeric, they need not be absolute;
signed numeric values indicate relative values. Consequently, a
property of a Classification might be to increase disk quota by 30MB,
instead of specifying some absolute value.
2.3 Modular
Rather than requiring code changes in many different places in order
to create new interfaces in the system, we designed the system to be
made up of a basic core which includes the database and the resource
resolution logic. The rest of the system interfaces to that core.
The core understands when an account needs to be created, or what the
properties of an account at any given time should be, but it knows
nothing about creating accounts. Instead, it can tell what the
account's properties are and what system(s) need to reflect those
properties. If the account needs access to both Unix and
Sybase, the core will pass the appropriate information to
the modules for Unix and Sybase through a standardized
interface.
As described in section 2.4, this design makes it possible
for even the database itself to be replaced with another database that
has the same schema.
2.4 Implementation Language
Perl was chosen as the implementation language. Its freely available
DBI (database interface) and DBD (database driver) package for talking
to a wide variety of databases makes it an excellent option for
building atop a database:
- Very little investment is made in an interface to any particular
database (since we code to DBI), thus allowing us (theoretically) to
use any database for which a DBD module exists. In practice, we did
use one Sybase-specific feature, which we can and will in the future
stop using.
- We can spend our time focusing on the problem at hand, instead
of how to write to the database.
- Being free, the price is right.
- Since source code is available, if there is a problem with it
for which a fix is not available, we can fix the bug ourselves.
Because parts of this system will need to run with superuser
privileges, we're concerned about safety of our code. Perl has some
excellent safety features: namely, the ability to identify ``tainted''
data and support for arrays and buffers that grow dynamically.
Finally, Perl is available on a huge number of platforms. As long as
we keep portability in mind, Perl will provide us all of the language
support necessary to allow our code to run unmodified on essentially
any system we could use in the forseeable future.
2.5 Programs
- rosterload
- is the program that loads rosters into the database.
In our environment, rosters are course rosters that identify which
students and instructors should be associated with a given section.
For the most part, rosterload just calls
Roster::rosterload, which does all of the work. Our plans
are for rosterload and Roster.pm to be modified to
be able to load roster information directly out of the Data
Warehouse or from email.
- makeaccounts
- is the program responsible for creating things,
that is, resources and accounts. First it will initialize the
resource methods, then it creates any resources that don't already
exist. Then it creates any accounts that don't exist. We have a
cron job run makeaccounts runs twice per day.
- expireaccounts
- is our garbage collector; it's responsible for
removing things from the system. Expired resources are removed from
both the database and from the systems that were used to support the
account. After this has been accomplished, it will remove entries
in the ``MembershipIn'' table that have expired and then any user
resources that are specified for the user in the database.
cron runs expiresaccounts once per day.
- notifier
- sends account removal notifications. All accounts
scheduled to be removed will receive a seven day notification.
Account that are `persistent' will also receive a 30 and 14 day
notification of expiration.2 Note that this
includes not only accounts for operating systems, but this also
includes ``accounts'' in software systems like Sybase.
As Pelendur continues to help us blur the distinctions
among different systems that comprise the ``computing environment'',
we'll move away from providing notification that specific systems'
accounts will be unavailable and provide notification only on the
user's ``meta-account'' in the department. The Sybase
account will be created as soon as the student shows up on the
roster for a class that has this resource. The account will
always3 match the
Unix login name and will have the typical default password.
- crconfig
- is the course configuration interface for instructors.
This provides a convenient means for them to manipulate the database
in a controlled manner, allowing them to change only that which is
under their administrative authority without creating artificial
restrictions that require staff to perform their administrative
tasks for them. Instructors and their ``proxies'' (those whom they
designate) to add and to remove resources from a particular section
and a course default. Project and course directories are also
configured through this interface.
- cradmin
- is the account management administration tool.
Essentially, this is the interface that is used to manipulate the
state of the database on anything that is in the database. Where
crconfig deals with abstractions and has a
``course-specific'' view of the world, cradmin allows
manipulation of non-course-related resources. The user interface
itself still works with many of these abstractions, but it is in
this program where we can make changes to the user interface to
allow administration of new resources.
2.6 Modules
Here we describe the major modules of the Pelendur system.
Many of these modules are shared by various standalone programs in the
system.
- IICFDB.pm
- is the module contains the generic routines for
interacting with the account management database. There is
typically a routine for each table for searching, adding, and
removing. They follow the naming convention
get_<table-name>, add_<table-name>,
remove_<table-name>. Table names are converted to lowercase
and underscores are used where there would be white space
(``MembershipIn'' becomes membership_in). For some tables an
update function exists also. The get_ routines are all
polymorphic and will do different searches depending on what data is
passed to the routine. This module also contains the routines to
walk the database recursively (resolve_pids) in order to
resolve all dependencies and provide an up-to-the-second view of an
account's properties, as determined by what the database knows about
the account.
- IICFLog.pm
- is our logging mechanism. Presently, this basically
accepts messages and puts them in the ``right place'', but the
intention is that it will be a general-purpose log gatherer for all
applications in our environment.
- IICFLogin.pm
- contains the routines that deals with logins in
our environment. It contains the routines that generates new
usernames and default passwords.
- NT.pm
- contains the routines to do all things on the NT systems.
It is currently not implemented; an older standalone account
creation and deletion system was developed locally for NT accounts.
Instead of implementing this part of the system initially, we opted
to focus on the Unix, Sybase, and core database portions
of the system, building an interface between Pelendur and
the old NT account management scripts. This decision has turned out
to work relatively well for us, and has saved us from what could
have become a large amount of redundant work as Microsoft seems to
think that making major interface changes from version to version of
its operating systems is an appropriate thing to do. Thus, we can
avoid writing most of NT.pm until the NT based systems are on
a newer version of the operating system than the current programs
support.
- PQuota.pm
- interfaces with the printing system's quota handling.
We use LPRng [6] for our printing system throughout the
department.
- ResourceMethods.pm
- is a module that contains all of the methods
for the resources in the account management system. Each resource
has its method defined for it in the
METHOD field in its
``Classification'' entry. When called, this method will `do the
right thing' for the resource. For example, a resource like
``mailing list'' might have a method that specifies a program to be
run in order to create or to delete the mailing list.
- Roster.pm
- contains the routines for processing the roster.
- Sybase.pm
- contains all of the routines for dealing with
Sybase resources.
- Unix.pm
- contains all of the routines for Unix resources. It
adds and removes accounts in the password file, manipulates the
group file, and generates the Quotas file for disk quotas.
- mkcisdir.pm
- is used to create and to remove directories on the
Unix systems. It handles several types of directories: users' home
directories, group project directories, and the directories used by
Submit, our program for electronic laboratory
submissions. Where possible, this module will run as the owner of
the directory instead of root.
3. The Effect of Pelendur
Our environment has benefited tremendously from Pelendur in
many ways. What used to be a painful experience for all is now
essentially a non-event.
3.1 Labor
The total staff labor expenditure for processing accounts between
terms is greatly reduced.
- Accounts to be removed are no longer processed manually. When
an account has no more references (managed through the
MembershipIn table), the account is garbage collected.
- Instructors would manually add and remove students from their
sections using hardcopy provided by the university registrar. We
now get these data from the registrar directly and automatically add
and remove students.
3.2 Error Rate
Historically, this has been a problem. The old software had to run by
someone who knew its idiosyncrasies and limitations. Mistakes were
frequent and could easily require several hours to fix. Errors are
now much less frequent, because we get data directly from the
registrar and do not require any manual intervention before account
configuration. Because the entire state of the system is driven by
the database, errors can now be fixed by making appropriate changes in
the database and waiting for Pelendur to propagate them.
Since managing our systems with Pelendur, we have been able
to identify accounts that have expired long ago but were never
removed, to identify problems that arise because of an account being
misclassified, and generally to free ourselves of the kinds of
concerns that come about when the administrators need to manage things
manually.
3.3 Latency
In section 2.5, we identified which parts of the system
run on a regular basis in our environment. These are configurable to
a site's requirements. In our environment specifically, this means
that changes made take no more than one day to take effect. This is a
huge difference from the days, weeks, or even more that it took under
the old system.
4. Future Work
Quite a lot can still be done with Pelendur. Specifically,
we need to increase the number of systems against which we can
interface, including native NT account management, more intelligent
5. Conclusions
Account management can be greatly simplified by taking a more abstract
view and thinking of system access as a property that results from the
state of the account. Pelendur has proven to be a highly
effective means of managing a very large number of highly variable
accounts.
6. Availability
Although the system has been designed and implemented in a way that
emphasizes flexibility and freedom from very a very site-specific view
of the world, it will still take quite a bit of effort for another
site to bring Pelendur into production. We're currently
working on finishing the functionality and hope that we will be able
to revisit some of the areas of the system that work for us but would
make it difficult or impossible for other sites to use the system
as-is. This work is geared toward making a general release of the
system. We have no idea when this could possibly take place.
- 1
-
Bob Arnold.
Accountworks: Users create accounts on SQL, notes, NT, and
UNIX.
In Twelfth Systems Administration Conference (LISA '98),
page 49, Boston, Massachusetts, December 6-11 1998. USENIX.
- 2
-
D. Brent Chapman.
Majordomo: How I manage 17 mailing lists without answering
``-request'' mail.
In Systems Administration (LISA VI) Conference, pages 135-143,
Long Beach, CA, October 19-23 1992. USENIX.
- 3
-
Matt Curtin.
Creating an environment for reusable software research: A case study
in reusability.
Technical Report OSU-CISRC-8/99-TR21, The Ohio State University,
Department of Computer and Information Science, August 1999.
- 4
-
Daniel E. Geer, Jr.
Service management at project athena.
In Large Installation Systems Administration Workshop
Proceedings, page 71, Monterey, CA, November 17-18 1988. USENIX.
- 5
-
J. Archer Harris and Gregory Gingerich.
The design and implementation of a network account management system.
In 10th Systems Administration Conference (LISA'96), pages
33-41, Chicago, IL, September 29 - October 4 1996. USENIX.
- 6
-
Patrick Powell and Justin Mason.
Lprng - an enhanced printer spooler system.
In Ninth Systems Administration Conference (LISA '95), pages
13-24, Monterey, CA, September 17-22 1995. USENIX.
- 7
-
Paul Riddle, Paul Danckaert, and Matt Metaferia.
AGUS: An automatic multi-platform account generation system.
In Ninth Systems Administration Conference (LISA '95), pages
171-180, Monterey, CA, September 17-22 1995. USENIX.
- 8
-
Mark A. Rosenstein, Daniel E. Geer, Jr., and Peter J. Levine.
The athena service management system.
In USENIX Conference Proceedings, pages 203-211, Dallas, TX,
Winter 1988. USENIX.
- 9
-
Henry Spencer.
Shuse: Multi-host account administration.
In 10th Systems Administration Conference (LISA'96), pages
25-32, Chicago, IL, September 29 - October 4 1996. USENIX.
- 10
-
Henry Spencer.
Shuse at two: Multi-host account administration.
In Eleventh Systems Administration Conference (LISA '97),
page 65, San Diego, California, October 26-31 1997. USENIX.
- 11
-
Gregory S. Thomas, James O. Schroeder, Merrilee E. Orcutt, Desiree C. Johnson,
Jeffrey T. Simmelink, and John P. Moore.
UNIX host administration in a heterogeneous distributed computing
environment.
In 10th Systems Administration Conference (LISA'96), pages
43-50, Chicago, IL, September 29 - October 4 1996. USENIX.
- 12
-
John Viega, Barry Warsaw, and Ken Manheimer.
Mailman: The GNU mailing list manager.
In Twelfth Systems Administration Conference (LISA '98), page
309, Boston, Massachusetts, December 6-11 1998. USENIX.
Pelendur: Steward of the Sysadmin
This document was generated using the
LaTeX2HTML translator Version 98.1p1 release (March 2nd, 1998)
Copyright © 1993, 1994, 1995, 1996, 1997,
Nikos Drakos,
Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html -split 0 -no_navigation -show_section_numbers pelendur.tex.
The translation was initiated by Matt Curtin on 2000-11-27
Footnotes
- ... Curtin1
- Now at Interhack Corporation.
- ... expiration.2
- The amounts of time on the
notification are configurable parameters.
- ... always3
- ``Always'' is a pretty strong word. There are a few
exceptions, as the result of ancient accounts that predate
Pelendur that still contain dashes (-) so they
can't be used as logins to Sybase. If the username has
a dash in it, the dash will be converted to an underscore
(_) for the Sybase account. New accounts are
always created with names that are portable across our systems so
that these kinds of conversions will not be necessary.
corporate |
research |
news |
people |
projects |
publications |
services |
feedback |
legal
Matt Curtin
2000-11-27