Reduce provider binary on-disk size #4383

Open
t0yv0 opened this issue Aug 19, 2024 · 6 comments
Labels
kind/engineering Work that is not visible to an external user

Comments

@t0yv0
Member

t0yv0 commented Aug 19, 2024

Consider looking at options to reduce provider on-disk size.

Per a customer comment: the unpacked on-disk size grew from ~400MB in 5.16 to ~800MB in 6.49.

Benefits of a leaner on-disk provider:

  • faster download times
  • less pressure on CI environment to have sufficient disk space

Possible culprits here:

  • embedded schema.json including more resources and examples for said resources, in more languages such as Java; could it be feasible to strip descriptions or at least examples from the schema distributed in the binary?

  • more embedded provider metadata, is there any compression that can be applied?

  • more Go dependencies statically linked in, anything that is possible to prune?
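To get a rough feel for the first two ideas, the sketch below strips string-valued "description" fields from a schema document and compares raw and gzipped sizes before and after. It is a minimal illustration, not how the provider handles its schema today: `slimSchema`, `stripDescriptions`, and `gzipSize` are hypothetical helpers, the sample JSON is a toy stand-in for the real schema.json, and it assumes descriptions live under keys literally named "description".

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// stripDescriptions recursively deletes string-valued "description" keys.
// Only string values are removed, so a property that happens to be *named*
// "description" (whose value is an object) is kept.
func stripDescriptions(v any) any {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			if _, isString := child.(string); isString && k == "description" {
				delete(t, k)
				continue
			}
			t[k] = stripDescriptions(child)
		}
		return t
	case []any:
		for i, child := range t {
			t[i] = stripDescriptions(child)
		}
		return t
	default:
		return v
	}
}

// slimSchema (hypothetical helper) decodes a schema, strips descriptions,
// and re-encodes it. Numbers pass through float64, which is fine for a sketch.
func slimSchema(raw []byte) ([]byte, error) {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return nil, err
	}
	return json.Marshal(stripDescriptions(v))
}

// gzipSize returns the size of data after gzip compression at the default level.
func gzipSize(data []byte) (int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return 0, err
	}
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	// Toy stand-in for a provider schema; the real schema.json is far larger.
	raw := []byte(`{"resources":{"aws:s3/bucket:Bucket":{"description":"Provides an S3 bucket resource, with usage examples.","properties":{"description":{"type":"string","description":"Free-form text."}}}}}`)

	slimmed, err := slimSchema(raw)
	if err != nil {
		panic(err)
	}
	before, err := gzipSize(raw)
	if err != nil {
		panic(err)
	}
	after, err := gzipSize(slimmed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("raw: %d -> %d bytes, gzipped: %d -> %d bytes\n",
		len(raw), len(slimmed), before, after)
}
```

Running this against the actual embedded schema would show how much of the binary is plausibly attributable to descriptions and examples, and whether compression alone recovers most of it.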

@pulumi-bot pulumi-bot added the needs-triage Needs attention from the triage team label Aug 19, 2024
@t0yv0
Member Author

t0yv0 commented Aug 19, 2024

Some information. Most of the size is present in the upstream provider build:

du -sh terraform-provider-aws
752M    terraform-provider-aws

@t0yv0
Member Author

t0yv0 commented Aug 19, 2024

From https://github.com/t0yv0/gobuildsize report on the upstream provider, major contributing packages are:

github.com/aws/aws-sdk-go-v2/service/ec2 137603662
github.com/hashicorp/terraform-provider-aws/internal/service/lexv2models 90502114
github.com/hashicorp/terraform-provider-aws/internal/service/batch 65336638
github.com/hashicorp/terraform-provider-aws/internal/service/ec2 60966940
github.com/hashicorp/terraform-provider-aws/internal/service/bedrockagent 59476792
github.com/aws/aws-sdk-go/service/sagemaker 56084368
github.com/aws/aws-sdk-go/service/quicksight 49658874
github.com/aws/aws-sdk-go-v2/service/iot 49391552
github.com/aws/aws-sdk-go-v2/service/glue 48909028
github.com/hashicorp/terraform-provider-aws/internal/service/securitylake 44860032
github.com/hashicorp/terraform-provider-aws/internal/service/cognitoidp 44200606
github.com/hashicorp/terraform-provider-aws/internal/service/securityhub 43308668
github.com/aws/aws-sdk-go-v2/service/rds 42986552
github.com/hashicorp/terraform-provider-aws/internal/service/verifiedpermissions 42005986
github.com/hashicorp/terraform-provider-aws/internal/service/appfabric 40250200
github.com/hashicorp/terraform-provider-aws/internal/service/rekognition 39744438
github.com/hashicorp/terraform-provider-aws/internal/service/ssmcontacts 39644536
github.com/hashicorp/terraform-provider-aws/internal/service/bedrock 38030686
github.com/hashicorp/terraform-provider-aws/internal/service/networkfirewall 37901482
github.com/hashicorp/terraform-provider-aws/internal/service/lakeformation 37305190
github.com/hashicorp/terraform-provider-aws/internal/service/medialive 37158322
github.com/hashicorp/terraform-provider-aws/internal/service/devopsguru 36337252
github.com/aws/aws-sdk-go-v2/service/chime 36253706
github.com/hashicorp/terraform-provider-aws/internal/service/cloudfront 35393416
github.com/hashicorp/terraform-provider-aws/internal/service/rds 34928184
github.com/aws/aws-sdk-go/service/connect 34686812
github.com/hashicorp/terraform-provider-aws/internal/service/elasticache 34474934
github.com/hashicorp/terraform-provider-aws/internal/service/s3 33639018
github.com/hashicorp/terraform-provider-aws/internal/service/bcmdataexports 33586502
github.com/aws/aws-sdk-go-v2/service/ssm 33467788
github.com/hashicorp/terraform-provider-aws/internal/service/timestreamwrite 33138974
github.com/aws/aws-sdk-go-v2/service/redshift 32826864
github.com/hashicorp/terraform-provider-aws/internal/service/appstream 32248636
github.com/hashicorp/terraform-provider-aws/internal/service/guardduty 32197636
github.com/hashicorp/terraform-provider-aws/internal/service/m2 31757684
github.com/aws/aws-sdk-go-v2/service/securityhub 31605124
github.com/hashicorp/terraform-provider-aws/internal/service/resourceexplorer2 31479248
github.com/hashicorp/terraform-provider-aws/internal/service/osis 31141590
github.com/hashicorp/terraform-provider-aws/internal/service/redshift 31092528
github.com/hashicorp/terraform-provider-aws/internal/service/s3control 30801928
github.com/hashicorp/terraform-provider-aws/internal/service/amp 30617456
github.com/aws/aws-sdk-go-v2/service/iam 30602038
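To see how the listed packages split between the AWS SDKs and the upstream provider's own code, the per-package numbers can be summed by import-path prefix. The sketch below assumes the report format shown above (one package path and one integer per line); `groupOf` and `sumByGroup` are hypothetical helpers, and the bucket names are chosen here only for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// groupOf maps a package path to a coarse bucket (illustrative grouping).
func groupOf(pkg string) string {
	switch {
	case strings.HasPrefix(pkg, "github.com/aws/aws-sdk-go-v2/"):
		return "aws-sdk-go-v2"
	case strings.HasPrefix(pkg, "github.com/aws/aws-sdk-go/"):
		return "aws-sdk-go (v1)"
	case strings.HasPrefix(pkg, "github.com/hashicorp/terraform-provider-aws/"):
		return "terraform-provider-aws"
	default:
		return "other"
	}
}

// sumByGroup parses "<package> <size>" lines and totals the sizes per bucket.
// Lines that do not have exactly two fields are skipped.
func sumByGroup(report string) (map[string]int64, error) {
	totals := map[string]int64{}
	sc := bufio.NewScanner(strings.NewReader(report))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		n, err := strconv.ParseInt(fields[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("bad size for %s: %w", fields[0], err)
		}
		totals[groupOf(fields[0])] += n
	}
	return totals, sc.Err()
}

func main() {
	// A few lines copied from the report above; paste the full list to total everything.
	report := `github.com/aws/aws-sdk-go-v2/service/ec2 137603662
github.com/hashicorp/terraform-provider-aws/internal/service/lexv2models 90502114
github.com/aws/aws-sdk-go/service/sagemaker 56084368`

	totals, err := sumByGroup(report)
	if err != nil {
		panic(err)
	}
	groups := make([]string, 0, len(totals))
	for g := range totals {
		groups = append(groups, g)
	}
	sort.Strings(groups) // stable output order
	for _, g := range groups {
		fmt.Printf("%-24s %d\n", g, totals[g])
	}
}
```

The totals are in whatever unit the report emits; the sketch only adds the numbers up.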

@ringods
Member

ringods commented Aug 20, 2024

@t0yv0 additional feedback from the customer about the impact of the plugin binary file size growth:

It is mainly two things:

  • download times
  • space on disk, had to delete 50GB of plugins from disk yesterday

@t0yv0
Member Author

t0yv0 commented Aug 20, 2024

This makes sense Ringo, thanks for that detail. With 50GB of plugins, I am wondering if something can be done at the plugin cache level, some form of scheduled eviction, as it appears multiple copies of provider(s) are involved there.

@tobiashenkel

> This makes sense Ringo, thanks for that detail. With 50GB of plugins, I am wondering if something can be done at the plugin cache level, some form of scheduled eviction, as it appears multiple copies of provider(s) are involved there.

A bit of context there. This grew over the course of roughly half a year while maintaining 50+ Pulumi stacks, each with its own repo and its own dependencies, which are regularly updated in a semi-automated way.

@flostadler flostadler added kind/engineering Work that is not visible to an external user and removed needs-triage Needs attention from the triage team labels Aug 20, 2024
@t0yv0
Member Author

t0yv0 commented Aug 20, 2024

I think there's a feature request in pulumi/pulumi that could be helpful in a situation like this: pulumi/pulumi#7505

I will cross-link and add some ideas there.

We would love to reduce the binary size of pulumi-aws, but since it appears to be dominated by the terraform-provider-aws binary size, this looks like a very difficult undertaking that is unlikely to get prioritized in the short term. Hence we need to consider more broadly what else we can do to alleviate the end-user problem here.


5 participants