Chikayama-Taura Laboratory Cluster User’s Guide (draft)

Dec. 2nd, 2003

Kenjiro Taura (firstname.lastname@example.org), Yuuki Horita (email@example.com), Masashi Suekane (firstname.lastname@example.org)
Department of Information and Communication Engineering
University of Tokyo

1 About this Document

Chikayama-Taura laboratory at University of Tokyo, Japan, is contributing a PC cluster system to the ApGrid Testbed. This document is a users' guide for the resource and covers (1) requirements for users, (2) procedures for getting an account, (3) hardware/software specifications and available Grid software, (4) resource usage policies, and (5) the support staff available and the mailing lists provided to users.

Depending on future administrative decisions, Chikayama-Taura laboratory may change resource configurations and usage policies, and distribute a revised version of this document through the users' mailing list.

2 Requirements for Users

Chikayama-Taura laboratory requests all users to fulfill the following conditions.

- All users must belong to one of the member organizations of the JpGrid/ApGrid project.
- All users participating in ApGrid must comply with the operating principles of ApGrid.
- All users must have concrete purposes for using resources at Chikayama-Taura laboratory.
- All users must contribute to JpGrid/ApGrid through the use of resources at Chikayama-Taura laboratory.
- All users must have a user certificate issued by an ApGrid trusted CA.*

* A list of ApGrid trusted CAs and procedures for obtaining a user certificate are available on the ApGrid home page.

3 How to Obtain Your Account

In order to obtain an account for using resources at Chikayama-Taura laboratory, send the following information to email@example.com:

(1) Full name
(2) Affiliation
(3) Address
(4) Country
(5) Phone and facsimile number
(6) Email address
(7) Purpose for using the resources
(8) Contribution to JpGrid/ApGrid
(9) Desired account name (1st, 2nd, and 3rd choices)*
(10) Login shell (bash, csh, tcsh, etc.)
(11) OpenSSH public key* **
(12) User certificate issued by an ApGrid trusted CA***

The application will be reviewed by the administrators, who will verify whether the applicant fulfills the requirements described in Section 2. If the application is approved, the administrators will create an account and send the necessary information to the applicant. Otherwise, the applicant will be notified that the request has been rejected.

By default, a newly created account is valid until the end of the Japanese fiscal year (the end of March). You are expected to move your files from the resources at Chikayama-Taura laboratory to your own resources before your account expires. If you want to keep your account and files, you must renew your account every March.

* Users can log in to resources at Chikayama-Taura laboratory only via ssh (ver. 2 protocol). The initial UNIX password will not be disclosed to users.

** The SSH public key and the user certificate should be sent as attachments to the application, rather than embedded in the body of the email. The public key should be in the OpenSSH format, rather than the commercial secure shell format. If you only have the latter, convert it to the former with the ssh-keygen command that comes with the OpenSSH distribution.

*** A mapping of the user's subject name to the local account will be added to the grid-mapfile on the resources at Chikayama-Taura laboratory.

4 Resource Information

The following information is subject to minor changes without notice. Major changes will be announced.

We provide a cluster called “marten.” It consists of one server node and (currently) nine compute nodes. The server is named “marten.logos.ic.i.u-tokyo.ac.jp” and the compute nodes are named marten02, marten03, marten04, … Note that the compute node names start with marten02; “marten01” is an alias for the server node.

All nodes run Debian Linux, with kernel version 2.4.18-bf2.4 and libc6 version 2.2.5-11.5. Nodes are either Dell Inspiron 8200 (P4 1.7GHz, 640MB memory) or IBM ThinkPad T23 (PIII 866MHz, 640MB memory) machines. They are connected via a 3Com 3C17100 48-port 100Mbps switch. All nodes run inside VMware™, hosted by Windows XP. Users have no logins on the Windows XP hosts. We welcome feedback on network and file system performance.

Users can log in to the server node via the SSH version 2 protocol. From the server node, you can then log in to compute nodes via either rsh or SSH. The server is the NIS server of the domain “MartenCluster” and all nodes are its clients, sharing logins, hosts, NFS maps, etc. All nodes share the home directory and the /usr/local directory, both hosted by the server.

Users should treat compute nodes as volatile (or “stateless”). We may add, delete, replace, or upgrade compute nodes on a daily basis without preserving their local state (file system). Users are free to put files in the compute nodes' local file systems, but should always assume those files may not persist. Information that is supposed to persist should be put on the server. We do try to maintain the server file system, but do not assume that it is fully reliable either. Always keep a copy at your own site of any information you cannot afford to lose!

PBS is installed. The PBS server is running on the master node marten01. The PBS path is “/opt/pbs”.

Globus Toolkit version 2.4 is installed on the master node marten01. GLOBUS_LOCATION is “/usr/local/globus”. Jobmanager-pbs is available and configured as the default jobmanager on the master node marten01.
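
As an illustration of how a batch job for the PBS installation described above might be prepared, the following is a minimal sketch in Python that renders the text of a PBS job script. It assumes the standard OpenPBS directives (-N, -l nodes, -l walltime, -j oe) and environment variables (PBS_O_WORKDIR, PBS_NODEFILE) are supported by the installed version; the function name render_pbs_script and the job name, node count, and walltime values are hypothetical and chosen only for the example.

```python
def render_pbs_script(job_name, nodes, walltime, command):
    """Return the text of a PBS job script as a single string.

    All argument values are illustrative; adjust them to your own job.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    lines = [
        "#!/bin/sh",
        f"#PBS -N {job_name}",          # job name shown in the queue
        f"#PBS -l nodes={nodes}",       # number of compute nodes requested
        f"#PBS -l walltime={walltime}", # wall-clock limit, HH:MM:SS
        "#PBS -j oe",                   # merge stderr into stdout
        'cd "$PBS_O_WORKDIR"',          # start in the submission directory
        command,                        # the actual work to run
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: list the nodes that PBS allocated to it.
    print(render_pbs_script("hello", 4, "00:10:00", 'cat "$PBS_NODEFILE"'), end="")
```

The output can be saved to a file in your home directory, which is shared by all nodes, and submitted from the server node with the qsub command. The exact location of qsub under “/opt/pbs” is not stated in this guide and should be checked on the system.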

Marten Cluster

Gateway node:
    marten.logos.ic.i.u-tokyo.ac.jp (188.8.131.52)
    ( = marten01.logos.ic.i.u-tokyo.ac.jp)

Master node:
    marten01.logos.ic.i.u-tokyo.ac.jp (192.168.0.1)
    Pentium® 4 1.70GHz, 640MB
    Linux kernel 2.4.18-bf2.4 (Debian Linux)
    NFS server, NIS server, PBS server, Globus gatekeeper (Globus Toolkit 2.4)

Computing nodes:
    marten02.logos.ic.i.u-tokyo.ac.jp (192.168.0.2)
    marten03.logos.ic.i.u-tokyo.ac.jp (192.168.0.3)
    marten04.logos.ic.i.u-tokyo.ac.jp (192.168.0.4)
    marten05.logos.ic.i.u-tokyo.ac.jp (192.168.0.5)
    marten06.logos.ic.i.u-tokyo.ac.jp (192.168.0.6)
    marten07.logos.ic.i.u-tokyo.ac.jp (192.168.0.7)
    marten08.logos.ic.i.u-tokyo.ac.jp (192.168.0.8)
        Pentium® 4 1.70GHz, 640MB
        Linux kernel 2.4.18-bf2.4 (Debian Linux)

    marten09.logos.ic.i.u-tokyo.ac.jp (192.168.0.9)
    marten10.logos.ic.i.u-tokyo.ac.jp (192.168.0.10)
        Pentium® III 866MHz, 640MB
        Linux kernel 2.4.18-bf2.4 (Debian Linux)

5 Resource Usage Policy

The server node acts as the PBS server and all compute nodes act as PBS clients. You are free to use them. However, compute nodes may also be used on a more ad hoc basis, via rsh, SSH, or other experimental tools for remote job submission. Do not assume that all users access compute nodes through a batch queue or a central resource manager.

Users who wish to use compute resources exclusively for a period of time should send a request to the users' mailing list: firstname.lastname@example.org. An example of such an email is given below. Users must also send an email when they have finished running their programs.

---------------------------------------------------------------------------------------------
I would like to exclusively use the following nodes. Please let me know if this is a problem.

Cluster  : marten
Nodes    : marten02-marten06
Duration : Jul. 30th 10:00 - Jul. 30th 16:00 (JST)
Comments : Other users running lightweight jobs in the background is OK.
---------------------------------------------------------------------------------------------

6 Support Staff and Mailing Lists

The Marten cluster is maintained by support staff. Your requests will be handled by the staff on a best-effort basis.

Chikayama-Taura laboratory provides the following two mailing lists for users:

email@example.com
    This mailing list is used mainly for announcements from administrators and support staff, and for notifications from users who wish to use the cluster exclusively. All users are automatically included in the list.

firstname.lastname@example.org
    The system administrators of the resources are on this mailing list. It is used for correspondence from all users. Any requests, questions, and comments should be sent to this mailing list.