/ archive / Proposal:_Use_Trunk-Based_Development.md
Proposal:_Use_Trunk-Based_Development.md
  1  # <span class="c2">Proposal: use trunk-based development</span>
  2  
  3  <span class="c0">Written by Oskar Thorén on September 16, 2017. Updated
  4  September 23.</span>
  5  
  6  <span class="c0"></span>
  7  
  8  <span class="c0">We should practice trunk-based development in
  9  status-react and status-go repos.</span>
 10  
 11  <span class="c0"></span>
 12  
 13  <span class="c0">Trunk-based development (TBD) is a source-control
 14  branching model, where developers collaborate on code in a single branch
 15  called ‘trunk’ (‘develop’ for us), resist any pressure to create other
 16  long-lived development branches by employing documented techniques,
 17  avoid merge hell, do not break the build, and live happily ever
 18  after.</span>
 19  
 20  ## <span class="c11 c12">Why?</span>
 21  
 22  <span class="c0">1) We have a lot of issues related to breaking builds,
 23  long-lived branches that other branches depend on, wasting precious time
 24  on coordinating and doing manual builds from ad hoc branches to patch
 25  things up.</span>
 26  
 27  <span class="c0"></span>
 28  
 29  <span class="c0">A breaking build on 'develop' (i.e. change that causes
 30  the app to stop working in a major way) should be flaming emergency and
 31  you stop everything to fix it, even if it means reverting
 32  commits.</span>
 33  
 34  <span class="c0"></span>
 35  
 36  <span class="c0">2) We already have a great QA team, a PR based
 37  workflow, a (semi-)-functioning CI, and feature flags. All of this is a
 38  perfect fit for TBD.</span>
 39  
 40  <span class="c0"></span>
 41  
 42  <span class="c4">Objection: We merged a few bugs into develop with
 43  changes related to wallet. If we use TBD such problems will increase. If
 44  we merge want this not to happen we have to not merge stuff into develop
 45  before merging, but this makes fast reviews (tens of minutes) not
 46  possible.</span>
 47  
 48  <span class="c4"></span>
 49  
 50  <span class="c4">Reply: It is OK to push a few bugs under WIP if nothing
 51  breaking like crazy. If it's a big/UX related change, QA can note known
 52  regressions/limitations of things hidden behind flag. Once flag is about
 53  to be turned on for release, you do more extensive checking. You don't
 54  need auto testing for this.</span>
 55  
 56  <span class="c0"></span>
 57  
 58  <span class="c0">3) TBD encourages shipping code, breaking things into
 59  smaller parts and getting things into trunk/develop as soon as
 60  possible.</span>
 61  
 62  <span class="c0"></span>
 63  
 64  <span class="c4">Objection: Commit every 24h isn't a good policy because
 65  it doesn't guarantee meaningful smaller parts.</span>
 66  
 67  <span class="c4"></span>
 68  
 69  <span class="c10">Reply: You don't need "commit every 24h", just if you
 70  open a PR and it hasn't been merged in two days it is most likely (a)
 71  not ready yet and/or (b) too big. Minimal utility can be for other
 72  developers as
 73  well.</span>
 74  
 75  ## <span class="c11 c12">I thought we kind of did this already; examples?</span>
 76  
 77  <span class="c0">- \`status-react\` has never depended on \`status-go\`
 78  develop branch as far as I can tell. This, in addition to lack of CI for
 79  status-go, leads to cascading  effects for status-react.</span>
 80  
 81  <span class="c0"></span>
 82  
 83  <span class="c4">Objection: I don't see a big problem in this, most
 84  cascading effects come from go-ethereum, which we don't control.</span>
 85  
 86  <span class="c4"></span>
 87  
 88  <span class="c4">Reply: It (status-go HEAD breaking for a month) can be
 89  solved by only actually doing the upgrade when all breaking changes have
 90  been incorporated. This too can be a flag and one commit. We have wasted
 91  a lot of time on this status-react if you sum it all up I think, and
 92  devs can't use bugfixes like crashing in debugger.</span>
 93  
 94  <span class="c4"></span>
 95  
 96  <span class="c4">Counter-reply: This will require a lot of coordination
 97  on status-react and status-go side.</span>
 98  
 99  <span class="c4"></span>
100  
101  <span class="c4">What if we have this situation like we have with
102  \`upgrade-status-go\`, when we can’t merge it because app will be fucked
103  (now eth.sendTransaction calls are blocking), but you need some changes
104  from this branch (“like crashing in debugger”, etc). TBD definitely will
105  not work if we will use it on our side here, i doubt that it will be
106  possible to choose dependency by flag (and even if so, fix for crushing
107  debugger will not be available), so then flags should be</span>
108  
109  <span class="c4">used on status-go side in order to achieve this. So we
110  still will have a loop between status-react and status-go: they have to
111  add flag (if possible), only then we can proceed.</span>
112  
113  <span class="c4"></span>
114  
115  <span class="c4">And then when it comes to using flags for bugs which
116  appears after they rebase on new version of go-ethereum and something is
117  not working, i hardly imagine how this is doable. I mean flag which will
118  choose between pieces of code before and after rebasing, it can be
119  impossible to separate these pieces at all.</span>
120  
121  <span class="c4"></span>
122  
123  <span class="c4">Counter-counter-reply: We can use branch by abstraction
124  as well <https://trunkbaseddevelopment.com/branch-by-abstraction/> and
125  possibly have the two in parallel. I guess it's a matter of extensive QA
126  on the upgrade-go-ethereum short lived branch and push bugfixes _to
127  trunk_. Google does TBD and has a single repo, and I'm guessing they
128  have worse dependency issues than we do, so it's not impossible. Then
129  it's a question whether that is _worth it_ and maybe we should have a
130  basic post mortem on what went wrong and how we can avoid this in
131  future. But it's certainly doable if we think it is worth it. That
132  doesn't mean it is definitely worth it in every possible circumstance
133  for our org/code base etc, but we should make trade-off explicit.</span>
134  
135  <span class="c0"></span>
136  
137  <span class="c0">- Long-lived PRs: 5+ open PRs that are older than a
138  week in both repos.</span>
139  
140  <span class="c0"></span>
141  
142  <span class="c4">Objection: On the other hand we keep not finished, not
143  completely meaningful piece of code outside codebase. Splitting things
144  in meaningful parts is not always straightforward.</span>
145  
146  <span class="c4"></span>
147  
148  <span class="c4">Reply: It's OK to keep WIP work in code base, it makes
149  it easier for someone to take over too. Of course it should always have
150  some utility to someone, not just random stuff. Utility can be for other
151  developers (yourself in future or other team members) too.</span>
152  
153  <span class="c0"></span>
154  
155  <span class="c0">- Long-lived branches: We use ‘epic/wallet’ when QA
156  flow and flag are already in place.</span>
157  
158  <span class="c0"></span>
159  
160  <span class="c4">Objection: We don't want WIP commits to go into develop
161  before QA, so we use epic branch to decrease number of testing
162  iterations.</span>
163  
164  <span class="c4"></span>
165  
166  <span class="c4">Reply: We can decrease number of testing iterations
167  just using flags. This is the exact same thing for QA especially if
168  everyone is running trunk and notice fatal errors immediately. The exact
169  involvement of QA for each WIP commit is up to QA and each PR.</span>
170  
171  <span class="c0"></span>
172  
173  <span class="c0">This is not any individual’s fault; it is a systemic
174  issue stemming from a lack of coherent vision and application of how we
175  do branching. Will there be issues? Sure, but nothing that can’t be
176  solved with disciplined application of known practices.</span>
177  
178  ## <span class="c11 c12">What would this mean in practice for us?</span>
179  
180  <span class="c0">- Ensure that \`develop\` is always working (tools we
181  have: CI, flags, incremental changes).</span>
182  
183  <span class="c0"></span>
184  
185  <span class="c4">Objection: Develop is fine, we also will need to be
186  able to build different versions of develop afaiu (based on
187  flags).</span>
188  
189  <span class="c4"></span>
190  
191  <span class="c4">Reply: Possibly. To start with all flags should be on
192  by default for nightlies, then for release builds almost all should be
193  off (depends on scope of release).</span>
194  
195  <span class="c0"></span>
196  
197  <span class="c0">- Be aggressive about not allowing long-lived branches
198  or PRs or weird branch dependencies.</span>
199  
200  <span class="c0"></span>
201  
202  <span class="c4">Comment on weird dependencies: we have several PRs
203  waiting for status-go upgrade, the PN one being biggest offender since
204  it is against a PR that was first opened a month ago and it requires a
205  lot of coordination despite not being directly tied to it, if breaking
206  things were hidden behind flag in status-go then this commit could've
207  happened anytime and PN branch would have debugger fix and PN public API
208  available to consume.</span>
209  
210  <span class="c0"></span>
211  
212  <span class="c0">- For releases, don’t do code freeze, instead
213  short-lived release branches + cherry pick from develop.</span>
214  
215  <span class="c0"></span>
216  
217  <span class="c4">Comment on releases: This should only be the concern of
218  the people actively working on creating release artifact, testing and
219  fixing regression. It has little impact on other devs WIP on develop.
220  </span>
221  
222  <span class="c0"></span>
223  
224  <span class="c15">For more details on TBD, read here:
225  </span><span class="c13 c15">[<https://trunkbaseddevelopment.com/>](https://www.google.com/url?q=https://trunkbaseddevelopment.com/&sa=D&ust=1507366105390000&usg=AFQjCNEm3ZdxQRxb4P3FHLexY1Corp8oPw)</span>
226  
227  ## <span class="c11 c12"></span>
228  
229  <span class="c0"></span>
230  
231  -----
232  
233  ## <span class="c11 c12"></span>
234  
235  ## <span class="c11 c12">General comments</span>
236  
237  <span class="c4">Comment: I see serious advantages in this approach,
238  because actually we have suffered from epic branches. Though still some
239  drawbacks \[captured in objections\]</span>
240  
241  <span class="c4"></span>
242  
243  <span class="c11 c7">Objection: Potential flag-based mess in
244  code.</span>
245  
246  <span class="c4"></span>
247  
248  <span class="c4">Reply: Yes we should be aggressive in deleting this,
249  can put 'best before' dates on branches - this requires some discipline
250  for sure. Also should get rid of flags as soon as they are in
251  release.</span>
252  
253  <span class="c4"></span>
254  
255  <span class="c11 c7">Objection: Resolving conflicts, this thing can’t
256  disappear.</span>
257  
258  <span class="c4"></span>
259  
260  <span class="c4">Reply: Why? with short lived PRs everyone will be more
261  up to date to trunk-state of the world, the distance between developers
262  will be shorter and deltas smaller. See for example that huge
263  refactoring PR, and a few things that depend on it and have either been
264  merged beforehand or after, both requiring extra conflict resolution
265  because of lack of shared work.</span>
266  
267  <span class="c4"></span>
268  
269  <span class="c7 c11">Objection: Coordination between status-go and
270  status-react.</span>
271  
272  <span class="c4"></span>
273  
274  <span class="c4">Reply: maybe but I'm not sure this is true if we do it
275  right (perhaps it is too expensive to do it right). In Rich Hickey's
276  words, "I don't want to know" - status-react should be able to depend on
277  status-go develop/HEAD (possibly last 'good commit' if bug is
278  introduced) and then new _capabilities_ are introduced in status-go.
279  Things being breaking changes can exists concurrently and then be
280  switched off, like deprecation notice kind of. I really like Rich
281  Hickey's philosophy of 'change' not being a useful word, either you
282  accrete, add to something and _provide_ more, or you break something
283  by introducing a bug or _requiring_ more. I think this lesson can be
284  applied outside of a Clojure ecosystem, even with crazy stuff like
285  go-ethereum vendoring, but it requires discipline</span>
286  
287  <span class="c4"></span>
288  
289  <span class="c4">There is a general point though that you hint at: a lot
290  of complexity status-go side are things I'm blissfully unaware of, so
291  the trade-off discussion there is better had there. We can still do
292  things on status-react side, though.</span>
293  
294  <span class="c4"></span>
295  
296  <span class="c7">Objection: Also we will need to redefine manual testing
297  process,</span><span class="c4"> so imagine task like wallet. If we add
298  commits with not finished features to trunk, we will add bugs for sure
299  (especially if new changes touch older code), so we still need to test
300  wallet related stuff periodically, we can’t wait for 100%
301  implementation. So then QA folks should be able to build develop with
302  flags they choose. It should be relatively easily doable with jenkins.
303  But then when it comes to flags existing in both status-react and
304  status-go, or just in status-go, it means that they have to build
305  status-go first with these flags and then link it with status-react,
306  only then build app. So we need to provide such tools. Also the list of
307  flags should be defined somewhere and well explained.</span>
308  
309  <span class="c4"></span>
310  
311  <span class="c4">Reply: QA can already test WIP wallet stuff with
312  Jenkins PR, no? It is true we might want something like develop with WIP
313  and without WIP, at least as we get closer to release. I don't know
314  about flags in status-go or why this would directly impact status-react,
315  but maybe I'm missing something. Agreed regarding list of being well
316  defined.</span>
317  
318  <span class="c4"></span>
319  
320  <span class="c4">Concrete example from stuff I'm working on now (PR:
321  <https://github.com/status-im/status-react/pull/1872>) in terms of
322  splitting work: I can submit PR that targets develop, doesn't have iOS
323  bindings/ mock call \`Notify\` and prints fcm-token and payload. Then
324  tester instruction is similar but deliverable is: add contact and see
325  their fcm-token in Testfairy logs, download logs and do \`grep
326  XXX\`.</span>
327  
328  <span class="c4"></span>
329  
330  <span class="c4">Then next PR is dependent on status-go public API
331  changes, ideally this would be available for consumption straightaway
332  when API is updated in trunk but if not, wait until this is provided by
333  status-go and then submit iOS/Java bindings. Possibly short-lived
334  minimal branch for QA to test flow that binds to specific unsupported
335  status-go branch, but this is where you maybe get some cascading
336  effects.</span>
337  
338  <span class="c4"></span>
339  
340  <span class="c4">Counter-reply:  We either have to always have already
341  compiled all invariants of flags for status-go (if don’t use only one
342  WIP flag), or to compile status-go as a step of status-react’s
343  build.</span>
344  
345  <span class="c4"></span>
346  
347  <span class="c4">The first case: There is some new feature in
348  status-react, like in that \`upgrade-status-go\` branch. Branch contains
349  changes related to new Whisper v5 api, and we definitely can abstract
350  this behind some protocol in cljs. So, we do this, test it with
351  status-go and find out that there is a breaking bug on status-go side.
352  We would commit this code to status-react’s trunk (it makes sense in
353  TBD, feature is not finished but we hide it anyway and don’t wait too
354  long for committing changes), but we can’t hide status-go related stuff
355  using the same flag, we need to add another flag in status-go first
356  which will hide new Whisper v5 api and provide v2 instead.</span>
357  
358  <span class="c4"></span>
359  
360  <span class="c4">The second case is similar, but related to that
361  go-ethereum rebasing thing: what if breaking change can’t be hidden by
362  flag at all. So we have short living branch in status-react, and we can
363  hide that feature but not status-go’s version on which it
364  depends.</span>
365  
366  <span class="c4"></span>
367  
368  <span class="c7">Objection: Seems obvious, the only reason we have
369  long-lived branches is due to being a new
370  team.</span><span class="c4"> I even wrote something like that before:
371  <https://hackmd.io/s/rkgqzRWIb>. I'm currently talking about status-go
372  as I've never had a chance to work to status-react to some
373  representative degree. I mean all our developers are middle to strong
374  but the team is weak just because we've just started working together on
375  the project nobody knows. This makes us do weird things and keep
376  long-lived branches.</span>
377  
378  <span class="c4"></span>
379  
380  <span class="c4">Reply: It might seem obvious yes, but we are not doing
381  it in practice. First step is getting everyone on the same page, I
382  think.</span>
383  
384  <span class="c4"></span>
385  
386  <span class="c4">Counter-reply: I thought everybody's already been on
387  the same page as I've already shared that document. The only thing
388  missing now is more aggressive approach to this requirement.</span>
389  
390  <span class="c4"></span>
391  
392  <span class="c7">Objection: Changes to stabilise everything shouldn't be
393  made at once,</span><span class="c4"> it's more flowing and gradual
394  process. Currently, everybody is occupied with the release and sees
395  broken and unreliable tests. Also, refactoring has come into play
396  recently. Once we have room to think about something else, we'll start
397  doing it. For example, everybody sees a lot of long-lived pull requests
398  and everybody understands that it shouldn't be so but we already have a
399  lot of more actual problems to tackle. From the status-go side we
400  already have more important things to tackle right now.</span>
401  
402  <span class="c4"></span>
403  
404  <span class="c4">Reply: As soon as release is out, I don't see why
405  status-react can't just depend on status-go develop straight away (even
406  if just a fixed commit). What downside would that have?</span>
407  
408  <span class="c4"></span>
409  
410  <span class="c4">Counter-reply: It can. status-react should just say: we
411  now depend on \`develop\`. Please, guys, keep it stable. This
412  requirement for us is already a good start.</span>
413  
414  <span class="c4"></span>
415  
416  <span class="c7">Objection: Let's start from status-react. Because these
417  problems are not so critical now for
418  status-go.</span><span class="c4"> They'll get more priority once we
419  have proper tests in place and the requirement from status-react to keep
420  status-go \`develop\` stable. Can open issue.</span>
421  
422  <span class="c4"></span>
423  
424  <span class="c10">Reply: OK.
425  </span><span class="c10 c13">[<https://github.com/status-im/status-go/issues/356>](https://www.google.com/url?q=https://github.com/status-im/status-go/issues/356&sa=D&ust=1507366105396000&usg=AFQjCNGtT8SN6Cm3tvdfddr3L6cYDSv3eA)</span><span class="c10"> 
426  Then I guess the other thing is about long lived PRs, but that's more
427  team-specific.</span>