First open source contribution!

March 10, 2018

We had a weird problem come up at work. We stream customer data from our main product to our new product in order to keep the data separate but consistent. It seemed that the process streaming that data was sometimes sending bad objects, and doing so repeatedly. As we dug into it, we found that we were triggering the data to be shipped from created/updated/destroyed row hooks inside transaction blocks in the ORM, but in some cases those transaction blocks were being rolled back after the data had already been shipped.

We decided that, to fix the issue without rewriting everything, we would add a signal on commit and on rollback. We would then buffer the data coming from the update/delete hooks and only release it once we received a commit signal with matching models and object ids. This change would also bring with it the benefit of greatly reducing the data we were pushing through: the update hooks created by the ORM were triggering queries for essentially every field that was updated. If you updated 10 fields on a user, that was 10 individual queries pushed across the sync service to the new product. By releasing changes only on commit (and discarding them on rollback), the buffering would cut our data transfer by about 90%.
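
To make the buffering idea concrete, here is a minimal sketch in plain Python. It is not our actual buffer service and it does not use SQLObject's API: the `CommitBuffer` class, its method names, and the `txn_id` key are all made up for illustration, and deletes are left out to keep it short. It only shows the shape of it: per-field updates pile up per row, get shipped as one change on commit, and get thrown away on rollback.

```python
from collections import defaultdict


class CommitBuffer:
    """Hypothetical buffer: hold row changes per transaction, release on commit."""

    def __init__(self, send):
        # `send` is an assumed callable that ships one merged change downstream.
        self._send = send
        # txn_id -> {(model, obj_id): merged field values}
        self._pending = defaultdict(dict)

    def record(self, txn_id, model, obj_id, fields):
        # Merge per-field updates for the same row into a single pending change.
        self._pending[txn_id].setdefault((model, obj_id), {}).update(fields)

    def on_commit(self, txn_id):
        # The transaction committed, so the buffered changes are safe to ship.
        for (model, obj_id), fields in self._pending.pop(txn_id, {}).items():
            self._send(model, obj_id, fields)

    def on_rollback(self, txn_id):
        # The changes never took effect; drop them without sending anything.
        self._pending.pop(txn_id, None)


sent = []
buffer = CommitBuffer(lambda model, obj_id, fields: sent.append((model, obj_id, fields)))

# Ten single-field updates to one user inside transaction 1, then a commit.
for i in range(10):
    buffer.record(1, "User", 42, {f"field_{i}": i})
buffer.on_commit(1)

# An update inside transaction 2 that ends up rolled back.
buffer.record(2, "User", 43, {"name": "oops"})
buffer.on_rollback(2)

print(len(sent))  # number of messages actually shipped
```

In this toy run the ten updates to the one user go out as a single message, and the rolled-back change never leaves the buffer.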

We use SQLObject, which is an obscure Python ORM, and we forked it approximately 5 years and 2 major versions ago. I started by looking at the current version of the project, which is still being maintained, to see if they had already implemented what we were looking for. They hadn't; in fact, I found that the commit & rollback methods had remained exactly the same since 2004. It ended up being a relatively straightforward fix once I understood the problem, so I made the change, wrote a test to cover it, and pushed it to our local fork. Once we implemented the buffer service in our main codebase, everything seemed to be working well.

(p.s. docker-compose made it easy to redirect a dependency to a local folder, which was invaluable for figuring out how to make this work)

At that point, knowing that this was a useful change for our org, I thought I'd go a step further and see if I could open a pull request into the main SQLObject project with these changes, in case someone else could make use of them. So, on a late Friday night after seeing Neil Gaiman give a talk downtown, I forked it, added my changes, ran the tests, and opened a PR with a brief version of the above story. I figured there might be some back and forth, a bit of discussion about pros and cons, etc. I guess I didn't really know what to expect, since I'd never made a contribution to an open source project before. But less than 12 hours later, the maintainer had simply pulled it in and updated the authors list to include my name. Turns out that contributing to open source can be pretty simple after all.

Here's to you, SQLObject v3.7.0. A year ago, I would've been afraid to even try to edit our org's fork. Now I'm haphazardly opening PRs into the main project, editing blocks of code that haven't been touched in 15 years. I hope someone else benefits from the work, and above all I truly hope I didn't accidentally make anything worse ¯\_(ツ)_/¯
