Mike's IRTF Troubleshooting Guide

I'm sorry that you're here. That means you're probably having a bad day. But I'm here to help!

SpeX

SpeX Shutdown Instructions

SpeX Power Up Instructions

Good temperatures

Bigdog Array:             37 K
Guidedog Array:           30 K
LN2:                      74 K
Cold Head:                10.7 K
Bigdog setpoint/heater:   37 K, 37%
Guidedog setpoint/heater: 30 K, 24%
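
If you want a quick sanity check of readings against the numbers above, here is a minimal Python sketch. It is not part of the SpeX software. The function name, the dictionary keys, and the 1 K tolerance are all assumptions made for illustration, and you have to type the readings in by hand from the temperature display.

```python
# Nominal SpeX temperatures in kelvin, copied from the "Good temperatures" list.
NOMINAL_K = {
    "bigdog_array": 37.0,
    "guidedog_array": 30.0,
    "ln2": 74.0,
    "cold_head": 10.7,
}


def out_of_range(readings_k, tolerance_k=1.0):
    """Return {sensor: reading} for readings farther than tolerance_k from nominal.

    tolerance_k is a hypothetical threshold, not an official limit.
    Sensors that are missing from readings_k are skipped.
    """
    bad = {}
    for name, nominal in NOMINAL_K.items():
        if name in readings_k and abs(readings_k[name] - nominal) > tolerance_k:
            bad[name] = readings_k[name]
    return bad


if __name__ == "__main__":
    # Made-up readings: only the cold head is far from its nominal value.
    example = {"bigdog_array": 37.1, "guidedog_array": 30.0, "ln2": 74.2, "cold_head": 14.0}
    print(out_of_range(example))
```

An empty result only means the readings are close to the values listed here. It does not replace looking at the temperature age and the heater percentages.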

Starting up

  - Click on the 'ssh bigdog' icon on the desktop
  - Enter the project password
  - Follow the directions to start the 4 processes
  - Start the IC first!           startic
  - Then start the IARC server.   startiarc
    If the IC isn't running, the IARC server may have problems starting.
  - Then start the XUI and DV.

- The most frequent error happens after we've shut down power, when there's confusion as to what the TO should do

Remote power control

  - Open an xterm
  - type 'rpcm'
  - choose the instrument
  - click on get status
  - SpeX H2RG and Aladdin share an outlet, but we don't know which of the two is the active plug, so
    turn on/off both of them.  
    - We put a dongle in to send a signal to the power supplies if...
  - Wait ~ 10 seconds
  - Hit go.init
  - SpeX: If you power cycle the motor controller, you need to initialize it (initialize all mechanisms).
          You shouldn't need to do anything more software-wise.

Sometimes you can have a second instance of the software

    Symptom: weird behavior for the user (go doesn't work, problems saving files, etc.)
    To check if there is another instance of some software:
    - Click on the 'ssh bigdog' icon on the Bigdog desktop.  Enter the project password.
    - cd to /home/bigdog/bin
    - Run report_bigdog to list the processes that are running (similar to ps -a)
    - There should only be one of each process running
    - If there are any duplicates, then kill ALL processes (the duplicates AND everything else too)
      kill -9 processidnumber
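
The "only one of each process" check can be sketched in a few lines of Python. This is an illustration only: I'm not assuming anything about what the report_bigdog output looks like, so the function just takes a list of process names that you have already pulled out of it. The names in the example are hypothetical stand-ins for the IC, IARC server, XUI and DV.

```python
from collections import Counter


def find_duplicates(process_names):
    """Return the sorted process names that appear more than once in the list."""
    counts = Counter(process_names)
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical process names; the real names come from report_bigdog.
    running = ["ic", "iarc_server", "xui", "dv", "ic"]
    dups = find_duplicates(running)
    if dups:
        print("Duplicates found, kill ALL processes:", ", ".join(dups))
    else:
        print("One of each, looks fine")
```

If anything shows up as a duplicate, the rule above still applies: kill everything, not just the extra copy.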

Shared memory

- If startic fails, especially several times in a row, then it's most likely a shared memory issue.
- The IC's shared memory file, shm_ic, lives in /dev/shm
- If so, then run this: /home/bigdog/bin/cleanic
          cleanic does: rm /dev/shm/iarc_server_sm
                        rm /dev/shm/shm_ic
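
Before running cleanic, you may want to see whether the shared memory files are actually there. The sketch below only looks; it never deletes anything. The two file names come from the cleanic description, and the function name is made up for this example. Note that a file being present doesn't prove it is stale, since a running IC would presumably have it open too.

```python
import os

# File names taken from the description of what cleanic removes.
SHM_FILES = ("iarc_server_sm", "shm_ic")


def leftover_shm_files(shm_dir="/dev/shm"):
    """Return the full paths of the IC shared-memory files that exist in shm_dir."""
    paths = [os.path.join(shm_dir, name) for name in SHM_FILES]
    return [p for p in paths if os.path.exists(p)]


if __name__ == "__main__":
    found = leftover_shm_files()
    if found:
        for path in found:
            print("present:", path)
    else:
        print("no IC shared memory files found")
```

If files are listed while nothing is running and startic keeps failing, that is the case cleanic is meant for.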

Littledog

- Once in a while ldog will glitch, or the PC will have been powered down
   - Check that the temperature age is reasonable.
   - First make sure ldog is running.  It handles the temperature stuff, so it needs to be running
     before the Bigdog/Guidedog software will work.
- Littledog has a VNC session running on it
- Open a generic xterm
- Run: vncviewer ldog:16000
- Enter the project password (NOT the VNC password of the day)
- Open an xterm in the ldog VNC
- Follow the directions to su to ldog, and restart the ldog ic

iSHELL

iSHELL Troubleshooting Instructions

iSHELL Mechanism & Temperature Issues

iSHELL Startup Instructions

Good temperatures

Kyle Array:               30 K
Kyle setpoint/heater:     30 K, 72%
Cartman Array:            37 K
Cartman setpoint/heater:  37 K, 68%
SIG setpoint/heater:      80 K, 92%

To check for/clean duplicate processes

  - Click on the 'ssh cartman' icon on the cartman desktop
  - Enter project password
  - cd to /home/cartman/bin
  - report_cartman
     - if you see duplicate processes running, run killic (next step)
  - killic
     - kills everything in the report_cartman list
     - might need to run it twice
  - cleanic clears out shared memory

Remote power control

  - Open an xterm
  - type 'rpcm'
  - choose the instrument
  - click on get status
  - SpeX H2RG and Aladdin share an outlet, but we don't know which of the two is the active plug, so
    turn on/off both of them.  
    - We put a dongle in to send a signal to the power supplies if...
  - Wait ~ 10 seconds
  - Hit go.init
  - iSHELL: If you power cycle the motor controller, you need to initialize it (initialize all mechanisms).
          You shouldn't need to do anything more software-wise.

Temperature Database Bug

  - If the temperature monitoring page stops updating:
  - Open an xterm in an iSHELL VNC
  - vncviewer kenny:1
  - Enter the project password
  - Log in as kenny2, using the project password
  - cd to /home2/kenny2/bin
  - Run restart_temp_monitoring

Other scripts (kill/start/restart)

  - report_kenny: lists the processes that are running in kenny, to check that the mechanisms are all running ok
  - to kill a process:
    - kill_all
    - kill_calibration_mechanism_servers
    - kill_cold_mechanism_servers
    - kill_kenny
    - kill_temp_monitoring
  - Restarting example
    - Type in 'restart_cold_mechanism' and hit enter.  You'll get a list of the mechanism numbers and names.
      - Then run it again with the number or the name of the service, e.g. 1 for the image rotator.
         - Example: restart_cold_mechanism 1 or restart_cold_mechanism imagerotator
    - If you restart a mechanism, then reinitialize it in the GUI.  

MIRSI


- If the temp is 4.5 (too cold), then array power is off.
  We often keep the power off because we're paranoid that the temp dongle won't work,
  or that we'll zap the array.

- MIRSI's PC is currently MIRSI2BU (BU=backup)

- /home/mirsi/bin
  - reportic
  - cleanic

- If there is a temperature/mechanism issue with MIRSI, then
  - vncviewer mirsi2bu:1
    - ls /home/mirsi/bin
    - top xterm is temp control
    - bottom xterm is mechanism control
    - control-C in the xterm will kill it.  
    - to start fresh:
      - open an xterm
      - run runtemperatures
      - run runmechanisms
      - reinitialize all mechanisms in the GUI
        - slit wheel takes a few minutes

Other

Coolracks Power

  - TOs often forget to check that the shiny green buttons are on
    - There are 8 total, two on each Coolrack
    - Each one is a pass-through that allows power to go to that rack
    - Power or temperature issues inside that cabinet could cause the button to go off.
  - Under 3 of the Coolracks are LED buttons
  - SW is set back behind a plate, and the other is on the left-hand side
    - SW is mostly Smokey and TEXES

The 'die' command

  - tells all of the startic processes to kill themselves, and kills the GUI

Restarting VNC

  - In an xterm (even as our own users), type 'vnc'
    This is a program that Miranda wrote.  It lists all of the VNC sessions that we have,
    and shows an example of how to kill/start a VNC
  - Example: to restart moc's VNC:
    vnc kill vnc-moc
    vnc start vnc-moc

If we can't save files

  - Probably a disk is full
  - Call Miranda immediately
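
To confirm the "disk is full" guess before calling, something like this minimal Python sketch would do. It uses the standard shutil.disk_usage call. The path and the 95% threshold are placeholders I made up; this guide doesn't say which disk the data is written to, so fill in the real mount point yourself.

```python
import shutil


def fraction_used(path="."):
    """Return the fraction (0.0 to 1.0) of the disk holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def nearly_full(path=".", threshold=0.95):
    """Return True if the disk holding `path` is at or above `threshold` full.

    threshold is a hypothetical cutoff, not an official limit.
    """
    return fraction_used(path) >= threshold


if __name__ == "__main__":
    # "." is a placeholder; replace it with the data directory that can't be written.
    print("fraction used: {:.1%}".format(fraction_used(".")))
    print("nearly full:", nearly_full("."))
```

Either way, the instruction stands: if files can't be saved, call Miranda immediately.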

Questions for Charles?

- What does the dongle do?
   -> Should prevent power going on if temps are bad
- Restarting iSHELL temp monitoring
- Restarting motor servers (don't have problems often, so we rarely do this)
- Restarting MIRSI anything
- When does a PC need to be rebooted?  
   - Hardly ever
   - Stefan: only the once-a-month reboot
             hosts the VNC sessions
             if there is a problem, should kill that VNC session, and restart that VNC session
- When to turn off/on array power?
  - Whose responsibility will that be?
  - If there is bad weather (esp. lightning) or power outage, then he turns off power.  
  - Might want to turn off MORIS too. 
  - >> Always check temperatures.  If temps are ok, then go ahead and turn power on
Last modified 3 Jan 2025