Hardware Test Engineer I/II
Space Exploration Technologies Corporation
Summary
Timeline
Started
Satellite Hardware Test Team
Owned test systems for four generations of Starlink flight computers and two generations of power boards
Transitioned To Remote
Moved To Oregon
Personal decision, but I was allowed to work on tools for the build reliability engineering team
Changed Teams
Components Test Infra Team
Vertical move that allowed for broader application of my skills
Finished
Thanks for all the fish!
Celebrated five and a half years of helping put thousands of satellites, and dozens of rockets, into orbit
Key Takeaways
- Created test systems that validated ~4,500 Starlink satellite flight computers and ~4,000 power boards
- Developed program-critical infrastructure that enabled efficient triage, management, tracking, and metrics of hardware failures
- Designed and deployed automated, unified, and containerized infrastructure to greatly increase application reliability and development speed
- Provided on-call support for on-orbit hardware, as well as satellite-test and components-test systems and infrastructure
Relevant Skills
Software & Environments
- Version Control
- Git
- Subversion
- Programming
- Languages
- Python 2/3
- Bash Shell Scripting
- High-Level Embedded C/C++ (Arduino/Teensy)
- HTML
- CSS
- TypeScript/JavaScript
- Databases
- Postgres
- Microsoft SQL
- MySQL
- DevOps
- Docker
- Ansible
- Kubernetes
- Methodologies
- Test Driven Development
- Agile
- Operating Systems
- Linux
- CentOS
- Ubuntu
- Microsoft Windows
- Applications
- Atlassian
- Jira
- Bitbucket
- Confluence
- Monitoring/Metrics
- Grafana
- Sentry
- OpsGenie
- Secrets
- Bitwarden
- HashiCorp Vault
- Other
- Microsoft Office Suite
- Smartsheet
Electrical
- Schematic & PCB Design
- Software
- Altium Designer
- Mentor Graphics PADS
- PCB Features
- Multi-layer (up to 16)
- Impedance Controlled Designs
- High Power Designs
- Power Simulation
- Manufacturing
- Gerber Export
- BOM Management
- Board-House Assembly
- In-House Assembly
- Electrical Diagnostics
- Multimeters
- Electronic Loads
- Oscilloscopes
- Logic Analyzers
- LCR Meters
- Harness Fabrication
- DC Low-Power & Signal
- DC High-Power
- Sub-Microwave RF
- Vacuum Rated Harnesses
- Vacuum Chamber Harness Passthroughs
- Test Rack Equipment
- Custom Fixture Design/Fabrication
- Single/Bi-Directional Power Supplies
- Electronic Loads
- NI Test Hardware
- LabJack
Mechanical
- Siemens Teamcenter PLM
- 3D Modeling
- Siemens NX
- Protocase Designer
- FreeCAD
- Fabrication
- Laser Cutting
- 3D Printing
- CNC
- Waterjet
- Hand Tools
Other
- Specialty Tests
- Highly Accelerated Life Testing
- Thermal Vacuum Chamber Testing
- Vibration Testing
Details
Starlink Test Engineering
When I started at SpaceX, I joined the hardware test engineering team for
the Starlink program in Redmond, WA. My first day was in early September
of 2019, only a few months after the V0.9 revision satellites had been
shot into space. This meant my start coincided with earnest preparations
to launch V1.0, as well as ramping production to rates previously thought
impossible in the space industry. My initial task at the company was
simple: offload the current hardware test engineer for flight computer in
any way possible, as he was significantly overloaded.
I put my all into this task, immediately taking over quite a few
production code debugging and improvement tasks, as well as running EMI
compliance testing, while absorbing as much as I could about the product,
its test systems, and its test procedures. During this time, I was
extremely glad to have already done a test engineering internship with the
company, as it made ramping up much faster and easier than it would have
been otherwise. Around three months after joining the team, ownership of
these test systems was transferred to me in its entirety. What followed
was nearly three years of the kind of intensity you'd expect out of a
company like SpaceX, but with the unique addition of higher production
rates than anywhere else in the company, or in the space industry.
When I say I took ownership of the flight computer test systems, I mean
this very literally, and in a specifically multi-disciplinary sense. I
handled high-level test rack design, purchasing, test fixture enclosure
design, test fixture PCB design, test fixture firmware development, test
rack networking configuration, test rack server configuration and tuning,
DUT harness design, and prototype DUT harness fabrication, along with test
software development covering not only production tests but also the more
strenuous qualification tests, which often required their own custom rack
designs compared to their production counterparts. In reality, this barely
scratches the surface, and it doesn't include the countless nights on the
production floor, personally running hundreds of boards through functional
and environmental testing while slowly and consistently improving the
quality, rate, and coverage of the tests. It also doesn't include the
entire year where I was on call 24/7 for a full week, every other week,
triaging on-orbit alerts for flight computer, reviewing data, and sending
commands to recover where needed.
As time progressed, I began to burn out and requested a transfer to a
different product. Flight computer was particularly intense due to having
little-to-no overlap with the rest of the satellite sub-systems, while
being doubly critical to satellite operation. I was given a new test
system for power boards, but due to unfortunate timing, the new owner of
flight computer changed companies, and I ended up having to temporarily
take over test rack hardware design, alongside permanently handling the
flight computer software through the end of this overall phase at SpaceX.
This turned out to be the most intense period of my time at the company,
with some of the longest hours and latest nights, the durations of which
I'd rather not share. However, as with all the time before it, I
successfully delivered multiple test system derivations for power board,
alongside all the software for multiple next-generation flight computers.
Build Reliability Engineering
In early 2022, I decided I was going to move in with my long-distance
partner down in Oregon. SpaceX is not normally known for allowing remote
work, but I floated the idea a few months before my August deadline for
leaving, just in case. My good friend, and head of the build reliability
engineering team at the time, also proposed a project for me to work on
that would enable fast triage and disposition of hardware failures coming
out of test. I'd previously taught him some basic web development to get a
prototype version of the app up and running, but it was fast approaching
its usability limits and needed professional support. The company approved
me to work on this for three months, with the stipulation that I come
on-site for one week a month. Since it was going to be short term, I
agreed and began working on the website!
This level of web development was rather new to me at the time; since
before even starting college, I'd always said I'd never become a full-time
web developer. Still, that wasn't going to stop me, and I quickly began
making improvements. One thing SpaceX had already taught me was that
development velocity is paramount, and that there should be as little
friction as possible between a change being made, validated, and deployed.
So, I started by creating a fully automated development/staging/production
DevOps environment and moved the application to Docker, with a RESTful
Flask backend and a Bootstrap frontend.
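To give a rough idea of the shape the backend took, here's a minimal
sketch of a RESTful Flask service. The actual application and its
endpoints aren't something I can share, so the resource names and fields
below are hypothetical stand-ins for the real failure-triage records:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory store standing in for the real database-backed models.
failures = {}


@app.route("/api/failures", methods=["POST"])
def create_failure():
    """Record a new hardware failure coming out of test."""
    record = request.get_json()
    failure_id = len(failures) + 1
    failures[failure_id] = {
        "id": failure_id,
        "part_number": record.get("part_number"),
        "symptom": record.get("symptom"),
        "disposition": "open",
    }
    return jsonify(failures[failure_id]), 201


@app.route("/api/failures/<int:failure_id>", methods=["GET"])
def get_failure(failure_id):
    """Fetch a single failure record for triage."""
    record = failures.get(failure_id)
    if record is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(record)


if __name__ == "__main__":
    app.run(debug=True)
```

The Bootstrap frontend consumed endpoints like these, and the whole stack
ran as Docker containers in each of the development, staging, and
production environments.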
After the three months were up, they decided to let me keep working
remotely so long as I still came in a week a month. For the next year and
a half, this tool grew and improved, gaining hundreds of users a day both
in and out of the reliability team. It turned out that many features of my
website ended up providing a better experience than the ones developed by
SpaceX's core applications engineering team, at least for Starlink's use
case. This did not go unnoticed, and that team eventually reached out to
create an equivalent, officially supported, first-party tool in the SpaceX
application ecosystem, even standing up a small team dedicated to its
creation and maintenance. This came at a perfect time, as I was losing my
ability to keep up with my custom website as a solo developer. Oh, and did
I mention I'd been supporting satellite hardware test remotely for the
latter nine months of this period, as well?
Components Test Engineering
In March of 2024, the team lead for components test software (the
components team being the parent group of satellite hardware test by that
point) reached out and asked if I wanted to join a small, two-person
infrastructure team. Given my remote situation, this made a lot more sense
than trying to continue supporting satellite hardware tests, and I could
bring a lot of experience with DevOps infrastructure and Docker that would
greatly benefit the team. I agreed, and began working on dockerizing the
team's tools and unifying them into a monorepo. I also began supporting a
relatively new tool for running test software that the team was
developing, along with a small slew of miscellaneous tools, such as those
used to image and provision servers over PXE.
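As a sketch of what that unification looked like, a monorepo like this
boils down to each tool living in its own directory with its own
Dockerfile, plus a bit of shared tooling to build them all consistently.
The repository layout and tool names here are invented for illustration,
not the team's actual structure:

```python
import pathlib
import subprocess

# Hypothetical monorepo layout: each tool lives in its own directory
# containing a Dockerfile, e.g. tools/pxe-imager/Dockerfile.
REPO_ROOT = pathlib.Path(__file__).resolve().parent
TOOLS_DIR = REPO_ROOT / "tools"


def build_all(tag: str = "latest") -> None:
    """Build a Docker image for every tool in the monorepo."""
    for dockerfile in sorted(TOOLS_DIR.glob("*/Dockerfile")):
        tool_dir = dockerfile.parent
        image = f"components-test/{tool_dir.name}:{tag}"
        print(f"Building {image} from {tool_dir} ...")
        subprocess.run(
            ["docker", "build", "-t", image, str(tool_dir)],
            check=True,
        )


if __name__ == "__main__":
    build_all()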
After dockerizing all the infrastructure and moving it into the monorepo,
my next push was to improve velocity and reliability by adding automated
build, test, and deploy tasks for new development, staging, and production
environments across all of it. I did this by combining a unified set of
Makefile interfaces with automated runs via Ansible, triggered from pull
requests and main-branch merges. Application and server monitoring was
also added via tools such as Sentry and Grafana, which were piped into
OpsGenie so the infra team could be paged during outages. Production
databases were also backed up during deployments, with restore tooling
that could be used to emulate production data in staging, development, and
local deploys. Overall, these changes greatly improved the speed at which
the team could develop, and helped ensure that what we deployed had been
validated as thoroughly as possible in advance.
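As one small example of that last piece, the backup/restore tooling can be
sketched roughly like this. The paths, database names, and surrounding
deploy hooks are hypothetical; this just shows the pg_dump/pg_restore
pattern for snapshotting a Postgres database and hydrating another
environment from it:

```python
import datetime
import pathlib
import subprocess

BACKUP_DIR = pathlib.Path("/var/backups/app-db")  # hypothetical location


def backup_database(db_name: str) -> pathlib.Path:
    """Dump a Postgres database to a timestamped custom-format archive."""
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = BACKUP_DIR / f"{db_name}-{stamp}.dump"
    subprocess.run(
        ["pg_dump", "--format=custom", f"--file={archive}", db_name],
        check=True,
    )
    return archive


def restore_database(archive: pathlib.Path, target_db: str) -> None:
    """Restore a dump into a staging/dev database, replacing existing objects."""
    subprocess.run(
        ["pg_restore", "--clean", "--if-exists",
         f"--dbname={target_db}", str(archive)],
        check=True,
    )


if __name__ == "__main__":
    # Example: snapshot production before a deploy, then hydrate staging from it.
    dump = backup_database("failures_prod")
    restore_database(dump, "failures_staging")
```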
At the end of January 2025, I was notified by the company that my remote
work exception would no longer be valid as of April 4th. I was now living
in northern Washington state, close to a two-hour drive, each way, from
the nearest SpaceX facility, so I opted to quit after roughly six years
with them in total. Considering the average tenure is two to three years,
I'd say I did quite well, and I am very proud of what I managed to achieve
with the company in that time! If you'd like to know more, feel free to
contact me, and I'm happy to share what I'm allowed to! This summary was
ultimately just the tip of the iceberg for an experience that made six
years feel like twelve.