Computational Resources

FACILITIES & OTHER RESOURCES

UNIVERSITY OF SOUTHERN CALIFORNIA – INSTITUTE FOR NEUROIMAGING & INFORMATICS

Figure 1. The new Soto Street Building on the USC Health Sciences Campus in Los Angeles, CA.

In September 2013, the University of Southern California (USC) in Los Angeles created the Institute for Neuroimaging and Informatics (INI), a world-class facility in which all the techniques for modern ultra-high-field and standard-field MRI brain scanning, image analysis, databasing, mathematics, genomic analysis, computation and medical imaging will be housed under one roof. Dr. Arthur Toga was recruited to USC to found the new Institute, along with eight other faculty members who had previously formed the Laboratory of Neuro Imaging (LONI) at UCLA. In March 2015, the university received a $50M transformative gift from Mark and Mary Stevens, and the institute was renamed the USC Mark and Mary Stevens Neuroimaging and Informatics Institute.

USC is remodeling a legacy building and constructing new facilities to house offices, lab space, an MRI suite, and computational resources at the Keck School of Medicine. The overall academic environment will be a specially designed modern building of 35,000 gross sq ft. While the Raulston building is being remodeled, temporary space is allocated within the Soto 1 building (Figures 1 and 2) on the Health Sciences Campus. The allocation of significant campus funding to create this Institute and its facilities creates the opportunity to establish broad and rich multidisciplinary expertise. The Keck School of Medicine has dedicated significant campus resources, capital infrastructure and space to LONI and the Stevens Neuroimaging and Informatics Institute.

The new Institute provides an extensive infrastructure designed and operated to facilitate modern informatics research and to support hundreds of projects, including several multi-site national and global efforts. Redundancy is built into all equipment, and the facility is fully secured to protect equipment and data. The resources described below provide networking, storage and computational capabilities that ensure a stable, secure and robust environment. It is an unprecedented test bed for creating and validating big data solutions. Because these resources have been designed, built and continuously upgraded over the years by our systems administration team, we have the expertise and operating procedures in place to use them to their maximum benefit.

Physical Infrastructure of the Soto Building at USC: The INI is physically located in the Soto Street Building (Figures 1 and 2) on the USC Health Sciences Campus, occupying two suites on the building’s ground floor. The space comprises a reception area, two director’s offices, faculty offices, a modern data center, a user area and conference rooms. The data center contains a 300 kVA UPS/PDU capable of providing uninterruptible power to mission-critical equipment housed in the room, dual 150 kVA connections to building power, an 800 kW Caterpillar C27 diesel backup generator, three Data Aire computer room air conditioning (CRAC) units, humidity control and a Cosco fire suppression and preaction system. A sophisticated event notification system is integrated into this space to automatically notify appropriate personnel of any power or HVAC problems that arise.

Data Center Security: The LONI datacenter is secured by two levels of physical access, to ensure HIPAA compliance for data security. The main facility is secured 24/7 with access control devices. Only authorized personnel are allowed in, and guests are permitted only after checking in, and only during business hours. The datacenter itself is additionally secured by a second layer of proximity card access. Only authorized staff are permitted to enter the datacenter facility. Individual racks containing HIPAA data are secured by lock and key to prevent cross access.

Figure 2. Architectural design of the temporary office space in the Soto building at USC. LONI spans suites 101 and 102 of the first floor.

Computational and Storage Resources: Rapid advancements in imaging and genetics technology have provided researchers with the ability to produce very high-resolution, time-varying, multidimensional data sets of the brain. The complexity of the new data, however, requires immense computing capabilities.

The compute infrastructure within the datacenter boasts 3,328 cores and 26 terabytes of aggregate memory space. This highly available, redundant system is designed for demanding big data applications. Blades in the Cisco UCS environment are easy to replace. A failing blade sends an alert to Cisco where a replacement ticket is generated automatically. Upon arrival, the new blade can go from the shipping box to being fully provisioned and in production in as little as 5 minutes.

Institutions and scientists worldwide rely on the Institute’s resources to conduct research. The Stevens Neuroimaging and Informatics Institute is architected using a fault-tolerant, high-availability systems design to ensure 24/7 functionality. The primary storage cluster consists of 33 EMC Isilon nodes providing 3.8 usable petabytes of highly available, high-performance storage. Data in these clusters moves exclusively over 10 Gb links, except for node-to-node communication within the Isilon cluster, which is handled by QDR InfiniBand providing 40 Gb/s bidirectional throughput on each of the cluster’s 66 links. Fault tolerance is as important as speed in the design of this datacenter. The Isilon storage cluster can gracefully lose multiple nodes simultaneously without noticeably affecting throughput or introducing errors. An EMC VNX 15-terabyte SAN cluster with tiered solid state disk storage complements the storage environment, providing another avenue of redundant storage, offered across differentiated networking, that adds a further layer of resilience for the data and virtualization infrastructure.

External services are load balanced across four F5 BIG-IP 2200S load balancers. The F5 load balancers provide balancing services for web sites and applications, as well as ICSA-certified firewall services. The core network is entirely Cisco Nexus hardware. Each of the two Cisco Nexus 5596s supports 1.92 terabits per second of throughput. Immediately adjacent to this machine room is a user space with twelve individual stations separated by office partitions. These workspaces are occupied by staff who continuously monitor the health of the data center and plan future improvements. Each space is also equipped with a networked workstation for image processing, visualization and statistical analysis.

Network Resources: Service continuity, deterministic performance and security were fundamental objectives that governed the design of the network infrastructure. The Institute’s intranet is architected using separate edge, core and distribution layers, with redundant switches in the edge and core for high availability, and with Open Shortest Path First (OSPF) layer 3 routing, instead of a traditional flat layer 2 design, to leverage the fault tolerance offered by packet routing and to minimize network chatter. While ground network connectivity is entirely Gigabit, server data connectivity is nearly all 10 Gigabit fiber and Twinax connected to a core of 2 Cisco Nexus 5596 switches, 10 Cisco Nexus 6628 switches, and 6 Cisco Nexus 2248 fabric extenders. For Internet access, the Institute is connected to the vBNS of Internet2 via quad fiber optic Gigabit lines using different route paths to ensure that the facility’s external connectivity will be maintained in the case of a single path failure.

The facility has two Cisco Adaptive Security Appliances providing network security and deep packet inspection. The Stevens Neuroimaging and Informatics Institute has also implemented virtual private network (VPN) services using SSL VPN and IPsec to facilitate access to internal resources by authorized users. A VPN connection establishes an encrypted tunnel over the Internet between client and server, ensuring that communications over the Web are secure.

Furthermore, the Institute has an extensive library of communications software for transmitting data and for recording transaction logs. The library includes software for monitoring network processes, automatically warning system operators of potential problems, restarting processes that have failed, and migrating network services to an available server. For instance, the laboratory has configured multiple web servers with Linux Virtual Server (LVS) software for high-availability web, application and database service provisioning as well as load balancing. A round-robin balancing algorithm is currently used, so that if the processing load on one server is heavy, incoming requests, whether HTTP, JSP or MySQL, are forwarded to the next available server by the LVS software layer. Listeners on each virtual server monitor the status and responsiveness of the others. If a failure is detected, an available server is elected as master and assumes control of request forwarding for the entire LVS environment.
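
The round-robin forwarding with failover described above can be illustrated with a minimal Python sketch. It is not LVS itself and does not reflect the Institute’s actual configuration: the class name `RoundRobinPool`, its `mark_down`/`mark_up` methods and the server names are hypothetical, and master election among the directors is not modeled.

```python
class RoundRobinPool:
    """Toy round-robin dispatcher that skips servers marked unhealthy."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the server to try first on the next request

    def mark_down(self, server):
        """Record that a health check failed for this server."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Record that a known server is responding again."""
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in rotation order."""
        for offset in range(len(self.servers)):
            idx = (self._next + offset) % len(self.servers)
            candidate = self.servers[idx]
            if candidate in self.healthy:
                self._next = (idx + 1) % len(self.servers)
                return candidate
        raise RuntimeError("no healthy servers available")


# Hypothetical server names, for illustration only.
pool = RoundRobinPool(["web1", "web2", "web3"])
first_cycle = [pool.pick() for _ in range(4)]
pool.mark_down("web2")  # simulate a failed health check
after_failure = [pool.pick() for _ in range(3)]
print(first_cycle)
print(after_failure)
```

Once `web2` is marked down, requests rotate only between the remaining servers until it is marked up again.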

Offsite and Onsite Backup Resources: All critical system data and source code are backed up regularly to local nearline storage and cloned to LTO-6 magnetic tape for offsite archival with Iron Mountain. Offsite backup data is kept current to within one week, so that all services can be rapidly redeployed with operations and data reflecting a reasonably recent point in time. The EMC Isilon nodes retain snapshots of one month’s worth of data, offering the most rapid but least disaster-resilient restoration. Snapshot retention allows rapid restoration of unintentionally overwritten or deleted data and does not require retrieval from archived tape. Nearline-stored backup data provides similarly rapid restoration of the prior year’s on-site data. Offsite archived magnetic media, while the last resort, can be recalled the same day and offers retrieval of data from a date range as recent as the prior week to as old as the initial archival series of tapes, which is over one year old. Onsite datasets grow rapidly and constantly, and require a flexible tape backup solution. Clones are pushed to tape by way of a pair of EMC backup accelerators to an expandable Quantum i500 tape array with 10 LTO-6 drives. The array provides the backup parallelism that the expansive data collection requires for offsite archival on such an aggressive schedule.
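
The three restoration tiers described above can be summarized as a simple decision rule: use the fastest tier that still holds the data. The following Python sketch is illustrative only. The function name, the tier labels, and the approximation of "one month", "one year" and "one week" as 30, 365 and 7 days are assumptions, not the Institute’s documented procedure.

```python
from datetime import timedelta

# Retention windows approximated from the description (assumed values).
SNAPSHOT_RETENTION = timedelta(days=30)   # Isilon snapshots: about one month
NEARLINE_RETENTION = timedelta(days=365)  # nearline storage: the prior year
OFFSITE_LAG = timedelta(days=7)           # offsite tape is current to within a week


def choose_restore_tier(age, onsite_available=True):
    """Pick the fastest tier expected to hold data of the given age.

    `age` is how far back the desired copy lies. When the on-site systems
    are lost, only offsite tape remains, and very recent data may not have
    been archived yet.
    """
    if age < timedelta(0):
        raise ValueError("age must be non-negative")
    if onsite_available:
        if age <= SNAPSHOT_RETENTION:
            return "isilon-snapshot"
        if age <= NEARLINE_RETENTION:
            return "nearline"
        return "offsite-tape"
    if age < OFFSITE_LAG:
        raise LookupError("data this recent may not yet be archived offsite")
    return "offsite-tape"


print(choose_restore_tier(timedelta(days=3)))
print(choose_restore_tier(timedelta(days=90)))
print(choose_restore_tier(timedelta(days=500)))
print(choose_restore_tier(timedelta(days=90), onsite_available=False))
```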

Virtualized Resources: Because of the rate at which new servers need to be provisioned for scientific research, the Institute deploys a sophisticated high-availability virtualized environment. This environment allows systems administrators to deploy new compute resources (virtual machines, or VMs) in a matter of minutes rather than hours or days. Furthermore, once deployed, these virtualized resources can migrate freely between all the physical servers within the cluster. This is advantageous because the virtualization cluster can intelligently balance virtual machines among the physical servers, which permits resource failover if a virtual machine becomes I/O starved or a physical server becomes unavailable. The net benefit for the Institute is that more software resources are deployed efficiently on a smaller hardware footprint, which results in savings in hardware purchases, rack space and heat output.

The software powering the virtualized environment is VMware’s ESX 5. ESX 5 is deployed on eight Cisco UCS B200 M3 servers, each with sixteen 2.6/3.3 GHz CPU cores and 128 GB of DDR3 RAM. These eight servers reside within a Cisco UCS 5108 blade chassis with dual 8×10 Gigabit mezzanine cards providing a total of 160 Gigabits of available external bandwidth. Storage for the virtualization cluster is housed on the 23 nodes of Isilon storage. The primary bottleneck for the majority of virtualization solutions is disk I/O, and the Isilon cluster more than meets the demands of creating a highly available virtualized infrastructure whose capabilities and efficiency meet or greatly exceed those of a physical infrastructure. A single six rack unit (6RU), eight-blade chassis can easily replicate the resources of a 600+ server physical infrastructure when paired with an appropriate storage solution such as the Isilon storage cluster.
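
As a quick check of the figures above, the aggregate capacity of the virtualization chassis follows directly from the per-blade numbers. The short Python sketch below only multiplies out the values stated in the text; the variable names are illustrative.

```python
# Per-blade and per-chassis figures as stated in the text.
blades = 8
cores_per_blade = 16
ram_gb_per_blade = 128
mezzanine_cards = 2          # "dual" cards
ports_per_card = 8
gbit_per_port = 10

total_cores = blades * cores_per_blade
total_ram_gb = blades * ram_gb_per_blade
external_bandwidth_gbit = mezzanine_cards * ports_per_card * gbit_per_port

print(f"cores: {total_cores}")                              # 128
print(f"RAM: {total_ram_gb} GB")                            # 1024 GB
print(f"external bandwidth: {external_bandwidth_gbit} Gb")  # 160 Gb
```

The 160 Gigabit total matches the external bandwidth quoted for the chassis.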

Figure 3. LONI/INI network infrastructure and supercomputing environment.

Workflow Processing: To facilitate the submission and execution of compute jobs in this compute environment, various batch-queuing systems such as SGE (https://arc.liv.ac.uk/trac/SGE) can be used to virtualize the resources above into a compute service. A grid layer sits atop the compute resources and submits jobs to available resources according to user-defined criteria such as CPU type, processor count, memory requirements, etc. The laboratory has successfully integrated the latest version of the LONI Pipeline (http://pipeline.loni.usc.edu) with SGE using DRMAA and JGDI interface bindings. The bindings allow jobs to be submitted natively from the LONI Pipeline to the grid without the need for external scripts. Furthermore, the LONI Pipeline can directly control the grid with those interfaces, significantly increasing the operating environment’s versatility and efficacy, and improving overall end-user experience. See Figure 4 for a screenshot of the latest version of the pipeline.
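
For readers unfamiliar with how resource criteria are expressed to SGE, the Python sketch below assembles an equivalent command-line `qsub` request. It does not show the DRMAA or JGDI bindings that the LONI Pipeline uses, and it only builds the argument list without submitting anything. The function name, the parallel environment name `smp`, the example job and the availability of `h_vmem` as a requestable resource are assumptions that depend on site configuration.

```python
import shlex


def build_qsub_command(job_name, command, slots=1, memory_per_slot="2G",
                       parallel_env="smp"):
    """Build an SGE qsub argument list with basic resource requests.

    `parallel_env` is a site-specific parallel environment name (assumed).
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    args = ["qsub", "-N", job_name, "-cwd", "-b", "y"]
    if slots > 1:
        # Request multiple slots through a parallel environment.
        args += ["-pe", parallel_env, str(slots)]
    # Request a per-slot memory limit.
    args += ["-l", f"h_vmem={memory_per_slot}"]
    args += list(command)
    return args


# Hypothetical job: the program and file names are placeholders.
argv = build_qsub_command(
    "skullstrip", ["bet", "in.nii.gz", "out.nii.gz"],
    slots=4, memory_per_slot="4G",
)
print(shlex.join(argv))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.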

Figure 4. The LONI Pipeline Execution Environment.

New Facility: The new construction in the Raulston Memorial Research Building will be the permanent home of the Stevens Neuroimaging and Informatics Institute. This new facility represents a significant investment on the part of the university. The Raulston renovations are projected to cost nearly $50M and take 18 months to complete. The entire façade as well as the interior of the building will be upgraded to complement the science being conducted at the Institute. The new facility will house a data center, a state-of-the-art theater and workspaces.

The data center will be approximately 3,000 square feet and is being designed using cutting edge high density cooling solutions and high density bladed compute solutions. A total of 48 racks will be installed and dedicated to research use. Of the 48, 10 racks will be reserved for core services. The core services are on separate, dedicated, redundant power to ensure continuous operation. The current design of the data center includes a Powerware 9395 UPS system providing two 750kW/825kVA UPSs in an N+1 configuration for non-core racks and two 225kW/250kVA in a 2N configuration for core services racks. The UPS sends conditioned power to 300kVA Power Distribution Units (PDUs) located inside the data center. The PDUs feed 400A rated Track Power Busways mounted above rows of racks providing an “A” bus and a “B” bus for flexible overhead power distribution to the racks. The design calls for the use of VRLA batteries with 9 minutes of battery run time for the core services UPS and 6 minutes of battery run time for the non-core UPS (note that the generator requires less than 2 minutes of battery run time in order to fully take over the load in the event of an outage). A 750kW/938kVA diesel emergency generator located in a weatherproof sound attenuated enclosure adjacent to the building will provide at least 8 hours of operation before needing to be refueled.
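
The battery and power ratings above can be sanity-checked with a few lines of arithmetic. The Python sketch below uses only the figures stated in the text: it treats the 2-minute generator takeover time as an upper bound and derives the implied power factor (kW divided by kVA) of each unit. The dictionary keys are illustrative names.

```python
# Generator takeover is stated as "less than 2 minutes"; use 2 as an upper bound.
GENERATOR_TAKEOVER_MIN = 2

battery_runtime_min = {"core": 9, "non_core": 6}
battery_margin_min = {
    name: runtime - GENERATOR_TAKEOVER_MIN
    for name, runtime in battery_runtime_min.items()
}

# (kW, kVA) ratings as stated in the text.
ratings = {
    "non_core_ups": (750, 825),
    "core_ups": (225, 250),
    "generator": (750, 938),
}
power_factor = {name: kw / kva for name, (kw, kva) in ratings.items()}

for name, margin in battery_margin_min.items():
    print(f"{name}: at least {margin} min of battery margin")
for name, pf in power_factor.items():
    print(f"{name}: implied power factor {pf:.2f}")
```

Under these figures both UPS groups retain several minutes of battery margin beyond the generator takeover time.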

The Cisco UCS blade solution described above allows the Institute to run the services of a much larger physical infrastructure in a much smaller footprint without sacrificing availability or flexibility. Each Cisco chassis hosts 8 server blades and has 160 gigabits of external bandwidth available per chassis. Each of the 48 racks can hold up to 6 chassis plus requisite networking equipment (4 fabric extenders). Thus, the new data center has adequate rack space to accommodate this project.
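
The rack-space claim can be made concrete with an upper-bound estimate. The Python sketch below multiplies out the stated figures, assuming (hypothetically) that every rack is filled to its maximum of 6 chassis and that the 10 core-services racks are excluded from research capacity.

```python
# Figures stated in the text.
total_racks = 48
core_service_racks = 10
chassis_per_rack = 6
blades_per_chassis = 8
gbit_per_chassis = 160

research_racks = total_racks - core_service_racks

# Upper bounds, assuming every rack is fully populated with chassis.
max_chassis_all = total_racks * chassis_per_rack
max_blades_all = max_chassis_all * blades_per_chassis
max_chassis_research = research_racks * chassis_per_rack
max_blades_research = max_chassis_research * blades_per_chassis
gbit_per_rack = chassis_per_rack * gbit_per_chassis

print(f"research racks: {research_racks}")                   # 38
print(f"max blades (all racks): {max_blades_all}")           # 2304
print(f"max blades (research racks): {max_blades_research}") # 1824
print(f"external bandwidth per full rack: {gbit_per_rack} Gb")  # 960 Gb
```

These are ceilings rather than planned deployments; actual density would depend on power and cooling limits per rack.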

In addition to a new data center, the new space in Raulston will house a 50-seat high definition theater – the Data Immersive Visualization Environment (DIVE). The prominent feature of the DIVE is a large curved display that can present highly detailed images, video, interactive graphics and rich media generated by specialized research data. The DIVE display will feature a dominant image area, with consistent brightness across the entire display surface, high contrast, and 150° horizontal viewing angle. The display resolution target is 4k Ultra HD, 3840×2160 (8.3 megapixels), in a 16:9 aspect ratio. Due to the ceiling height requirements, the DIVE will require two floors of the building. The DIVE is designed to facilitate research communication, dissemination, training and high levels of interaction.

Adjoining the Raulston Memorial Research Building on the north side is a brand new MRI facility. The Center for Image Acquisition (CIA) will house a Siemens Magnetom Prisma, a new 3 Tesla MRI scanner, and a Siemens Magnetom 7T MRI scanner. The Magnetom Prisma 3T includes a new, high-end gradient system that simultaneously delivers 80 mT/m gradient amplitude at a 200 T/m/s slew rate, a combination of high amplitude and fast switching that is currently unparalleled. The Siemens Magnetom 7T MRI system is an investigational device. Both MRI scanners are at the leading edge for neuroimaging.

The Raulston building will house upwards of 90 faculty, postdocs, students and staff in a variety of seating configurations from private offices to shared offices to open workspaces designed for collaboration. Great care was taken to design the people space to facilitate ongoing discussion and collaboration while providing a peaceful work environment for all.

A rendering of the new building is included below (Figure 5). The Raulston building is scheduled to open in 2016.

Figure 5. Future home of the Institute for Neuroimaging and Informatics showing the data center (south side), DIVE, and the Center for Image Acquisition (north side).

Additional Office Spaces: Additional office space for faculty, staff and students is available in Suite 400 of the Marina Towers in Marina del Rey, CA. This building houses other high-tech research groups, such as the Information Sciences Institute, and serves as a West LA hub for the Stevens Neuroimaging and Informatics Institute. Suite 400 is nearly 5,200 sq ft, with 14 faculty and shared offices, 2 director’s offices, 2 conference rooms, a student workroom with 8 workstations and a large kitchen.