Slzzp Tech Note: June 2021

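This is a race condition you often run into when using terraform to manage IAM roles on a GCP project / pubsub subscription / pubsub topic / storage bucket / etc. I'm currently about 80% sure the cause is that the relevant GCP API calls are not one single action but several small separate ones (so the change as a whole is not atomic), and that they ~~fire and forget~~ return right away without confirming the change has actually finished.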

For now the only fix is a workaround: run the terraform command again, possibly several times. (The reason is explained below.)

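Assume the following situation (the real one may be different, or involve more than this):
- bucket foobar already has the role roles/storage.legacyBucketReader
- it now needs to be changed to the role roles/storage.legacyBucketOwner
- run terraform apply (unrelated parts omitted):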

  Terraform used the selected providers to generate the following execution plan. Resource actions are
  indicated with the following symbols:
  -/+ destroy and then create replacement

  Terraform will perform the following actions:

  # module.basement.google_storage_bucket_iam_member.bucket must be replaced
  -/+ resource "google_storage_bucket_iam_member" "bucket" {
      ~ role   = "roles/storage.legacyBucketReader" -> "roles/storage.legacyBucketOwner" # forces replacement
    }

  Plan: 1 to add, 0 to change, 1 to destroy.

  Do you want to perform these actions?
    Terraform will perform the actions described above.
    Only 'yes' will be accepted to approve.

  Enter a value:

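  This means the bucket's IAM role will be destroyed first and then recreated.
- After typing yes and pressing enter, it spits out an error message saying the role could not be set, or something like that.... (omitted)
- Usually, running terraform apply once more and answering yes again lets it finish normally.
- If it still spits out the same error message, wait a bit and run it again...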
- The End.
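
For concreteness, the resource behind the plan above might look roughly like the sketch below. Only the resource type, its name, the bucket, and the two role values come from the plan output; the `member` value is a hypothetical placeholder, and the real module may well be structured differently.

```hcl
# Sketch only: the member is a hypothetical placeholder, not the real setup.
resource "google_storage_bucket_iam_member" "bucket" {
  bucket = "foobar"

  # Was "roles/storage.legacyBucketReader". Editing this value is what
  # makes the plan show "-/+ ... # forces replacement".
  role = "roles/storage.legacyBucketOwner"

  member = "serviceAccount:example@example-project.iam.gserviceaccount.com"
}
```

Because `role` cannot be updated in place, terraform removes the old binding and adds the new one within the same apply, which is where the error shows up.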

Someone might ask: can't you use time_sleep so that the destroy finishes before the apply?

First, this is a single resource being replaced (destroy -> apply), not separate resources running one after another, so the approach above does not apply here.

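Second, inside terraform this is presumably a call to the GCP API to remove and then add, with no delay in between. There shouldn't be one either, since it would hurt execution efficiency (and could create other race conditions). The problem is that remove and add are each probably made up of smaller separate steps and return before all of them are confirmed complete, which produces the race condition afterwards.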

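Finally, the time_sleep approach is clumsy and has its own problems, because users usually don't know how long create takes (create_duration) or how long destroy takes (destroy_duration), and can only estimate or guess. If GCP isn't busy, everything may finish in 1s and you still have to sit through the remaining 29s; or it takes 1m to finish, and a 30s setting hits the race condition anyway.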
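
For reference, the time_sleep pattern in question usually looks something like the sketch below. It assumes the hashicorp/time provider is configured; the `member`, the downstream object, and the 30s values are illustrative assumptions, not taken from a real configuration.

```hcl
# Sketch only: names, member, and durations are illustrative assumptions.
resource "google_storage_bucket_iam_member" "bucket" {
  bucket = "foobar"
  role   = "roles/storage.legacyBucketOwner"
  member = "serviceAccount:example@example-project.iam.gserviceaccount.com" # hypothetical
}

# Waits 30s after the IAM member is created, and 30s before it is destroyed.
resource "time_sleep" "wait_for_iam" {
  depends_on = [google_storage_bucket_iam_member.bucket]

  create_duration  = "30s" # a guess; the real propagation time is unknown
  destroy_duration = "30s" # likewise a guess
}

# Hypothetical downstream resource that only starts after the sleep.
resource "google_storage_bucket_object" "example" {
  depends_on = [time_sleep.wait_for_iam]

  name    = "example.txt"
  bucket  = "foobar"
  content = "hello"
}
```

Note that the delay sits between separate resources. Nothing here puts a pause between the destroy and the create of the same google_storage_bucket_iam_member, which is the step that actually fails.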

You might as well just wait a bit by hand and rerun the same terraform command by hand; that is simpler and more practical.

Monday, June 21, 2021

Terraform - GCP IAM apply / destroy race condition
