Hardware Test Engineer I/II

Space Exploration Technologies Corporation

Summary

Timeline

Started

Satellite Hardware Test Team

Owned test systems for four generations of Starlink flight computers and two generations of power boards

Transitioned To Remote

Moved To Oregon

Personal decision, but I was allowed to work on tools for the build reliability engineering team

Changed Teams

Components Test Infra Team

Lateral move that allowed for broader application of my skills

Finished

Thanks for all the fish!

Celebrated five and a half years of helping put thousands of satellites, and dozens of rockets, into orbit

Key Takeaways

  • Created test systems that validated ~4,500 Starlink satellite flight computers and ~4,000 power boards
  • Developed program-critical infrastructure that enabled efficient triage, management, tracking, and metrics of hardware failures
  • Designed and deployed automated, unified, and containerized infrastructure to greatly increase application reliability and development speed
  • Provided on-call support for on-orbit hardware, as well as satellite-test and components-test systems and infrastructure

Relevant Skills

Software & Environments

  • Version Control
    • Git
    • Subversion
  • Programming
    • Languages
      • Python 2/3
      • Bash Shell Scripting
      • High-Level Embedded C/C++ (Arduino/Teensy)
      • HTML
      • CSS
      • TypeScript/JavaScript
    • Databases
      • Postgres
      • Microsoft SQL
      • MySQL
    • DevOps
      • Docker
      • Ansible
      • Kubernetes
  • Methodologies
    • Test Driven Development
    • Agile
  • Operating Systems
    • Linux
      • CentOS
      • Ubuntu
    • Microsoft Windows
  • Applications
    • Atlassian
      • Jira
      • Bitbucket
      • Confluence
    • Monitoring/Metrics
      • Grafana
      • Sentry
      • OpsGenie
    • Secrets
      • Bitwarden
      • HashiCorp Vault
    • Other
      • Microsoft Office Suite
      • Smartsheet

Electrical

  • Schematic & PCB Design
    • Software
      • Altium Designer
      • Mentor Graphics PADS
    • PCB Features
      • Multi-layer (up to 16)
      • Impedance Controlled Designs
      • High Power Designs
      • Power Simulation
    • Manufacturing
      • Gerber Export
      • BOM Management
      • Board-House Assembly
      • In-House Assembly
  • Electrical Diagnostics
    • Multimeters
    • Electronic Loads
    • Oscilloscopes
    • Logic Analyzers
    • LCR Meters
  • Harnessing Fabrication
    • DC Low-Power & Signal
    • DC High-Power
    • Sub-Microwave RF
    • Vacuum Rated Harnesses
    • Vacuum Chamber Harness Passthroughs
  • Test Rack Equipment
    • Custom Fixture Design/Fabrication
    • Single/Bi-Directional Power Supplies
    • Electronic Loads
    • NI Test Hardware
    • LabJack

Mechanical

  • Siemens Teamcenter PLM
  • 3D Modeling
    • Siemens NX
    • Protocase Designer
    • FreeCAD
  • Fabrication
    • Laser Cutting
    • 3D Printing
    • CNC
    • Waterjet
    • Hand Tools

Other

  • Specialty Tests
    • Highly Accelerated Life Testing
    • Thermal Vacuum Chamber Testing
    • Vibration Testing

Details

Starlink Test Engineering

  When I started at SpaceX, I joined the hardware test engineering team for the Starlink program in Redmond, WA. My first day was in early September of 2019, only a few months after the V0.9 revision satellites had been shot into space. This meant my start coincided with earnest preparations to launch V1.0, as well as ramping production to rates previously thought impossible for the space industry. My initial task at the company was simple: offload the current hardware test engineer for flight computer in any way possible, as he was significantly overloaded.
  I put my all into this task, immediately taking over quite a few production code debugging and improvement tasks, as well as running EMI compliance testing, while absorbing as much as I could about the product, its test systems, and its test procedures. During this time, I was extremely happy to have already done a test engineering internship with the company, as it made ramping up much faster and easier than it would have been otherwise. Around three months after joining the team, ownership of these test systems was transferred to me in its entirety. What followed was nearly three years of the kind of intensity you'd expect out of a company like SpaceX, but with the unique addition of higher production rates than anywhere else in the company, or the space industry.
  When I say I took ownership of flight computer test systems, I mean this very literally, and specifically in a multi-disciplinary sense. I handled high-level test rack design, purchasing, test fixture enclosure design, test fixture PCB design, test fixture firmware development, test rack networking configuration, test rack server configuration and tuning, DUT harness design, and prototype DUT harness fabrication, along with test software development to cover not only production tests but also more strenuous qualification tests, which often required their own custom rack designs compared to their production counterparts. Even this barely scratches the surface, and doesn't include the countless nights on the production floor, personally running hundreds of boards through functional and environmental testing while slowly and consistently improving the quality, rate, and coverage of the tests. It also doesn't include the whole year where I was on call 24/7 for a full week, every other week, triaging on-orbit alerts for flight computer, reviewing data, and sending commands to recover where needed.
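  To give a flavor of what that test software involved, here is a minimal sketch of a production functional test step in Python. The instrument interfaces, rail names, and limits below are purely illustrative placeholders, not SpaceX code.

    # Hypothetical production functional test step: power the DUT, measure
    # each rail with a bench DMM, and pass/fail against voltage limits.
    RAIL_LIMITS_V = {"3V3": (3.25, 3.35), "5V0": (4.90, 5.10)}  # example limits

    def test_power_rails(dut, dmm, log):
        """Check every monitored power rail on the DUT against its limits."""
        dut.power_on()
        try:
            for rail, (low, high) in RAIL_LIMITS_V.items():
                measured = dmm.measure_dc_volts(channel=dut.rail_channel(rail))
                passed = low <= measured <= high
                log.record(step="power_rails", rail=rail, value=measured,
                           low=low, high=high, passed=passed)
                assert passed, f"{rail} out of range: {measured:.3f} V"
        finally:
            dut.power_off()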
  As time progressed, I began to burn out and requested a transfer to a different product. Flight computer was particularly intense due to having little to no overlap with the rest of the satellite sub-systems, while being doubly critical to satellite operation. I was given a new test system for power boards, but due to unfortunate timing, the new owner of flight computer changed companies, and I ended up temporarily taking over test rack hardware design while permanently handling the flight computer software through the end of this overall phase at SpaceX. This turned out to be my most intense period at the company, with some of the longest hours and latest nights, the durations of which I'd rather not share. However, as before, I successfully delivered multiple test system derivatives for power board, alongside all the software for multiple next-generation flight computers.

Build Reliability Engineering

  In early 2022, I decided I was going to move in with my long-distance partner down in Oregon. SpaceX is not normally known for allowing remote work, but I floated the idea a few months before my August deadline for leaving, just in case. My good friend, then head of the build reliability engineering team, also proposed a project for me to work on that would enable fast triage and disposition of hardware failures coming out of test. I'd previously taught him some basic web development to get a prototype version of the app up and running, but it was fast approaching its usability limits and needed professional support. They approved me to work on this for three months, with the stipulation that I come in person for one week a month. Since it was going to be short term, I agreed and began working on the website!
  This level of web development was rather new to me at the time; since before even starting college, I'd always said I'd never become a full-time web developer. Still, that wasn't going to stop me, and I quickly began making improvements. One thing SpaceX had already taught me was that development velocity is paramount, and that there should be as little friction as possible between a change being made, validated, and deployed. So I started by creating a fully automated development/staging/production DevOps environment and moved the application to Docker, with a RESTful Flask backend and a Bootstrap frontend. After the three months were up, they decided to let me keep working remotely so long as I still came in a week a month. For the next year and a half, this tool grew and improved, gaining hundreds of users a day both inside and outside the reliability team. Many features of my website ended up providing a better experience than those developed by SpaceX's core applications engineering team, at least for Starlink's use case. This did not go unnoticed, and that team eventually reached out to create an equivalent, officially supported, first-party tool in the SpaceX application ecosystem, even standing up a small team dedicated to its creation and maintenance. This came at a perfect time, as I was losing my ability to keep up with my custom website as a solo developer. Oh, and did I mention I'd also been supporting satellite hardware test remotely for the latter nine months of this period?
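  As a rough illustration of the shape of that stack, a minimal RESTful Flask endpoint for failure records might look like the sketch below. The route, fields, and in-memory store are assumptions made for the example; the real application and its data model were internal.

    # Minimal sketch of a RESTful Flask backend for hardware failure records.
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    failures = []  # stand-in for the real database-backed store

    @app.route("/api/failures", methods=["GET"])
    def list_failures():
        """Return recorded failures, optionally filtered by board type."""
        board = request.args.get("board")
        return jsonify([f for f in failures if board in (None, f.get("board"))])

    @app.route("/api/failures", methods=["POST"])
    def create_failure():
        """Record a new failure coming out of test so it can be triaged."""
        record = request.get_json(force=True)
        record["id"] = len(failures) + 1
        failures.append(record)
        return jsonify(record), 201

    if __name__ == "__main__":
        app.run(debug=True)  # development only; deployed builds ran in Docker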

Components Test Engineering

  In March of 2024, the team lead for components test software (the components team being the parent group of satellite hardware test by that point) reached out and asked me if I wanted to join a small, two-person infrastructure team. Given that I was remote, this made a lot more sense than trying to continue supporting satellite hardware tests, and I could bring a lot of experience with DevOps infrastructure and Docker that would greatly benefit the team. I agreed, and began working on dockerizing the team's tools and unifying them into a monorepo. I also began supporting a relatively new tool the team was developing for running test software, along with a small slew of miscellaneous tools, such as those used to image and provision servers over PXE.
  After dockerizing all the infrastructure and moving it into the monorepo, my next push was to improve velocity and reliability by adding automated build, test, and deploy tasks for new development, staging, and production environments across all of it. I did this by combining a unified set of Makefile interfaces with automated runs via Ansible, triggered from pull requests and main-branch merges. Application and server monitoring was also added via tools such as Sentry and Grafana, which were piped into OpsGenie so the infra team could be pinged during outages. Production databases were also backed up during deployments, with restore tooling that could be used to emulate production data in staging, development, and local deploys. Overall, these changes greatly improved the speed at which the team could develop, and helped ensure that what we deployed was validated as thoroughly as possible in advance.
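  For a sense of what that backup/restore tooling involved, here is a stripped-down sketch in Python. The use of pg_dump/pg_restore, the paths, and the function names are assumptions for illustration; the real deploy scripts were internal.

    # Hypothetical deploy-time database backup/restore helpers, assuming a
    # Postgres database and the standard pg_dump/pg_restore command-line tools.
    import subprocess
    from datetime import datetime, timezone
    from pathlib import Path

    BACKUP_DIR = Path("/srv/backups")  # illustrative location

    def backup(db_url: str) -> Path:
        """Dump the production database to a timestamped custom-format archive."""
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        target = BACKUP_DIR / f"prod-{stamp}.dump"
        subprocess.run(["pg_dump", "--format=custom", f"--file={target}", db_url],
                       check=True)
        return target

    def restore(db_url: str, dump_file: Path) -> None:
        """Load a dump into a staging or dev database to emulate production data."""
        subprocess.run(["pg_restore", "--clean", "--if-exists", "--no-owner",
                        f"--dbname={db_url}", str(dump_file)], check=True)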
  At the end of January 2025, the company notified me that my remote work exception would no longer be valid as of April 4th. I was by then living in northern Washington state, close to a two-hour drive each way from the nearest SpaceX facility, so I opted to leave after roughly six years with the company in total. Considering the average tenure is two to three years, I'd say I did quite well, and I'm very proud of what I managed to achieve in that time! If you'd like to know more, feel free to contact me; I'm happy to share what I'm allowed to. This summary was ultimately just the tip of the iceberg for an experience that made six years feel like twelve.