We had a bad type coercion in the code to update container cluster
resource labels. This fixes the coercion.
We didn't notice this because there was no test exercising the update
code path. I've added the test, which reproduced the panic before this
PR was applied and passes successfully now that the coercion is fixed.
Images now have a licenses field, which lets users specify licenses to
use on an image. But official images already have licenses on them, adn
so Terraform is reading those as a diff. To preserve backwards
compatiblity and avoid a breaking change that would require all
`google_compute_image` users to update their configs, I've set the field
to computed. This means that if no licenses are set in the config, there
can be licenses in the config without prompting a diff. Obviously, this
isn't ideal, as it means you can't ever remove all the licenses from an
image, but I think the benefits here outweigh the drawbacks.
In testing an upcoming `google_compute_region_disk` resource, I had to make these changes. Checking them in separately so that when the magician runs, these changes will already be a part of TF.
Currently, the rolling-update API expects a PUT (regionInstanceGroupManager.update())
instead of a PATCH (regionInstanceGroupManager.patch()) call, so immediate rolling updates
(as specified with `update_strategy = "ROLLING_UPDATE"``) are never triggered:
```
[DEBUG]: ---[ REQUEST ]---------------------------------------
[DEBUG]: POST /compute/beta/projects/mycompany-myapp-staging/regions/europe-west3/instanceGroupManagers/myapp-server/setInstanceTemplate?alt=json HTTP/1.1
....
[DEBUG]: ---[ RESPONSE ]--------------------------------------
...
[DEBUG]: "warnings": [
[DEBUG]: {
[DEBUG]: "code": "FIELD_VALUE_OVERRIDEN",
[DEBUG]: "message": "Update policy type was set to OPPORTUNISTIC. Please use regionInstanceGroupManager.update() to preserve the policy."
[DEBUG]: }
```
refs:
https://cloud.google.com/compute/docs/reference/rest/beta/instanceGroupManagers/patchhttps://cloud.google.com/compute/docs/reference/rest/beta/instanceGroupManagers/updateFix#1506
This was done as its own resource as suggested in slack, since we don't have the option of making all fields Computed in google_compute_instance. There's precedent in the aws provider for this sort of thing (see ami_copy, ami_from_instance).
When I started working on this I assumed I could do it in the compute_instance resource and so I went ahead and reordered the schema to make it easier to work with in the future. Now it's not quite relevant, but I left it in as its own commit that can be looked at separately from the other changes.
Fixes#1582.
When using predefined storage ACLs, you'd get a permadiff, because the
role_entities list was computed, but was never set in state. So it would
be read as empty in the config, and not present in state, so Terraform
would want to pull it down and sync it. This is probably, technically
speaking, a bug in Terraform, but we can work around it by just setting
role_entities to an empty value on every read.
Hypothetically fixes#1643.
@thomasriley, are you able to patch this change into your provider to see if it fixed the problem? I haven't been able to get a working repo so I haven't verified the fix yet.
Move from using qa.test.com, a domain we don't own, to qa.tf-test.club,
a domain we do own, so the domain validation doesn't cause our tests to
fail anymore.
Add a CustomDiff function to storage bucket ACLs that will ignore a diff
if the config and state have the same role_entities, even if they're in
a different order.
Fixes#1525.
Commit 8f31fec introduced a bug for the 'service_account_key' resource
where it required a project be set either in the provider or in the
resource for 'service_account_key', but a project isn't required if the
service account is a service account fully qualified name or a service
account email.
This PR relaxes the requirement that a project needs to be set for the
'service_account_key' resource, 'service_account' datasource and
'service_account_key' datasource, but will error if we try to build a
fully qualified name from a service account id when no project can be
found.
This also cleans up 'serviceAccountFQN' so it is slightly easier to
follow and return an error if there is no project but we need one to
build the service account fully qualified name.
Fixes: #1655
Added node config 'disk_type' which can either be 'pd-standard' or
'pd-ssd', if left blank 'pd-standard' will be the default used by google
cloud.
Closes: #1656
## What
As well as https://github.com/terraform-providers/terraform-provider-google/pull/1282 , make `resource_container_node_pool` importer accept `{project}/{zone}/{cluster}/{name}` format to specify the project where the node pool belongs to actually.
## Why
Sometimes I want to import container pool in different project from default SA's. However, currently there is no way to specify project the target node pool belongs to, Terraform tries to retrieve node pool from SA's project, then it fails due to `You cannot import non-existent resources using Terraform import.` error.
As discussed in #1326, we're not going to remove name_prefix for
compute_ssl_certificate, because it makes the common use case more
ergonomic by a good amount, and the only cost is it's harder to maintain
the autogenerated code, and we've decided the benefits outweigh the
costs in this circumstance.
* Allow using in repo configuration for cloudbuild trigger
Cloudbuild triggers have a complex configuration that can be defined
from the API. When using the console, the more typical way of doing this
is to defined the configuration within the repository and point the
configuration to the file that defines the config.
This can be supported by sending the filename parameter instead of the
build parameter, however only one can be sent.
* Acceptance testing for cloudbuild trigger with filename
Ensure that when a cloudbuild repo trigger is created with a filename,
that filename is what actually ends up in the cloud.
* Don't specify "by default" in cloudbuild-trigger.
The docs shouldn't say that "cloudbuild.yaml" is used by default. There
is no default from the APIs, but the console suggest using this value.
Just say it's the typical value in documentation.
No longer ForceNew when adding an App Engine application to a project,
when modifying the auth domain, modifying the serving status, or
modifying the feature settings.
* Use errwrap to retain original error
* Use built-in Page function, only return names when listing services
This removes the custom logic on pagination and uses the built-in Page function in the SDK to make things a bit simpler. Additionally, I added a field filter to only return service names, which drastically reduces the size of the API call (important for slow connections, given how frequently this function is executed).
Also added errwrap to better trace where errors originate.
* Add helper function for diffing string slices
This just looked really nasty inline
* Batch 20 services at a time, handle precondition failed, better errwrap
This commit does three things:
1. It batches services to be enabled 20 at a time. The API fails if you try to enable more than 20 services, and this is documented in the SDK and API. I learned this the hard way. I think Terraform should "do the right thing" here and batch them in series' of twenty, which is what this does. Each batch is tried in serial, but I think making it parallelized is not worth the complexity tradeoffs.
2. Handle the precondition failed error that occurs randomly. This just started happened, but it affects at least two APIs consistently, and a rudimentary test showed that it failed 78% of the time (78/100 times in an hour). We should fix this upstream, but that failure rate also necessitates (in my opinion) some mitigation on the Terraform side until a fix is in place at the API level.
3. Use errwrap on errors for better tracing. It was really difficult to trace exactly which error was being throw. That's fixed.
* Updates from code review
* Add basic update for `google_kms_crypto_key` resource
Prior to this commit, any changes to `rotation_period` would
force a new resource as no `Update` was defined for the resource.
This commit introduces a basic `Update` through calling the
`Patch` service method. It only modifies the `rotation_period`,
and `next_rotation_time` at the moment, but this is reflective
of what is "allowed" on https://console.cloud.google.com/security/kms.
* Remove unused `Purpose` value in `CryptoKey`
We are only patching the `rotation_period`, and `next_rotation_time`,
so that value will not be affected.
* nit: format `Patch` operation to be in a single line
* Extend `TestAccKmsCryptoKey_rotation` test steps
- Test change in rotation period
- Test removal of rotation period
* Do not parse `NextRotationTime` if it is not set
* remove ForceNew: false
When creating a trigger by using the project defined in the schema we
enforce that the repo must be in that same project. We should be looking
at the project defined in the trigger_template data and falling back to
that first project if not found.
Closes: #1555
When enabling services, after the waiter returns, list the enabled
services and ensure the ones we enabled are in there. If not, retry. May
not always resolve#1393, but should help. Unfortunately, the real
answer is probably either:
1. For us to try and get the API updated to only return the waiter when
the service will consistently be available. I don't know how feasible
this is, but I'm willing to open a ticket.
2. For us to build retries into ~all our resources to retry for a set
amount of time when a service not enabled error is returned. This would
greatly slow down the provider in the case of the service legitimately
not being enabled, but is how other providers handle this class of
problem.
Unfortunately, due to the eventual consistency at play, this is a hard
issue to reproduce and prove, though it matches with my
experience--while testing this patch, one of the tests failed with the
error that the serviceusage API hadn't been enabled, but only on step 4
of the test, when calls had already succeeded. Which suggests eventual
consistency, to me. Regardless, this patch shouldn't _hurt_ and should
mostly be an imperceptible change to users, and should make instances
like #1393 less likely.
* vendor service usage api
* use serviceusage api instead of servicemanagement for project services
* add bigquery-json to test
* add import for project service
* add serviceusage_operation.go
When reading a project, both App Engine and Billing would fail if
_neither_ the default project the provider was configured with nor the
project being targeted had the App Engine Admin or Billing APIs
(respectively) enabled. We didn't catch this because our source project
obviously has both enabled.
This fixes the matter by checking the error returned from each of those,
and if it looks like an API not enabled error, logging it at warning
level instead of returning it as an error, which will let the read
proceed as usual.
IAP has no reasonable support policy, because PATCH is broken, and IAP
must be configured with an OAuth2 client ID and secret that belongs to
the project the app is associated with. There's no programmatic way to
create Clients. But we create the project and the app at the same time,
and we can't update because PATCH is broken. So this just drops IAP. It
also forces all our updates to ForceNew, because we can't update.
Also, adds more test coverage and docs, and fixes import by not relying
on the config for setting app engine info in state.
* Revert "Merge pull request #1434 from terraform-providers/paddy_revert_beta"
This reverts commit 118cd71201, reversing
changes made to d59fcbbc59.
* add ConvertSelfLinkToV1 calls to places where beta links are stored
Fix a panic in our test that is caused by a ListPolicy being nil. I
assume, but cannot verify, that this is an API change in that it may now
send back a nil listpolicy if a default is used.
Add the `enable_flow_logs` field to our subnetwork resource, so we can
specify whether [flow logs][1] should be enabled in Terraform configs.
Note that this behavior isn't explicitly documented yet, but it has made
it into the beta API client.
[1]: https://cloud.google.com/vpc/docs/using-flow-logs
This PR also switched us to using the beta API in all cases, and that had a side effect which is worth noting, note included here for posterity.
=====
The problem is, we add a GPU, and as per the docs, GKE adds a taint to
the node pool saying "don't schedule here unless you tolerate GPUs",
which is pretty sensible.
Terraform doesn't know about that, because it didn't ask for the taint
to be added. So after apply, on refresh, it sees the state of the world
(1 taint) and the state of the config (0 taints) and wants to set the
world equal to the config. This introduces a diff, which makes the test
fail - tests fail if there's a diff after they run.
Taints are a beta feature, though. :) And since the config doesn't
contain any taints, terraform didn't see any beta features in that node
pool ... so it used to send the request to the v1 API. And since the v1
API didn't return anything about taints (since they're a beta feature),
terraform happily checked the state of the world (0 taints I know about)
vs the config (0 taints), and all was well.
This PR makes every node pool refresh request hit the beta API. So now
terraform finds out about the taints (which were always there) and the
test fails (which it always should have done).
The solution is probably to write a little bit of code which suppresses
the report of the diff of any taint with value 'nvidia.com/gpu', but
only if GPUs are enabled. I think that's something that can be done.
This PR does a few things to the User-Agent header:
1. It puts Terraform/(version) first, since that's the way the RFC
expects it
2. It removes the goos and goarch data, although I could be convinced to
put it back in, I don't see what value it's providing
3. Moves directly to consuming the version package (which is the comment
above the function previously being called)
This simply adds the specification for operation timeouts, and sets
sane defaults. In testing against specific regions, creation of SQL
databases would fluctuate between 7-14 minutes against us-east1. As
such, a 15m creation threshold is recommended for this. Update and
Delete operations will adhere to 10m timeouts.
This follows a similar format as #1309.
* escape the folder name (in case of spaces, etc)
* add test case for folder with space
* add missing args
* make separate tests for each folder test, get folder name length under API limits
* further abstract out the resource name to prevent test collisions
* workaround multiple results returning for a given query by looping over return
* split test cases into separate funcs
* adding google folder data source with get by id, search by fields and lookup organization functionality
* removing search functionality
* creating folders for each test and updating documentation with default values
* Add support for regional GKE clusters in google_container_cluster:
* implement operation wait for v1beta1 api
* implement container clusters get for regional clusters
* implement container clusters delete for regional cluster
* implement container clusters update for regional cluster
* simplify logic by using generic 'location' instead of 'zone' and 'region'
* implement a method to generate the update function and refactor
* rebase and fix
* reorder container_operation fns
* cleanup
* add import support and docs
* additional locations cleanup
* Updates the default GKE legacy ABAC setting to false
* Updates docs for container_cluster
* Update test comments
* Format fix
* Adds ImportState test step to default legacy ABAC test