What if you want to ensure that the same packages are available, built in a similar way, between your Mac and some Linux servers? What if you need to share or ensure that your projects work between Linux and Windows?
What if you are supporting a lot of less-sophisticated users across a number of different OSes, or even different versions of the same OS?
There are a ton of subtleties in the build toolchain even within a single OS type, and these will lead to downstream frustrations with packages that just bundle pre-compiled versions of C/C++ libraries.
The conda approach treats the underlying libraries as first-class citizens in the package ecosystem and tracks their interdependencies, and most packages are built in such a way that they are relocatable on your filesystem and don't require system privileges to install. conda and the various packages in the conda universe (whether official ones from Anaconda or the community-built ones in conda-forge) all make it so that this baseline hard problem is mostly solved across all major OSes, and solved in a relatively consistent way.
You can think of conda as a cross-platform, cross-architecture, multi-language, userspace package manager. It grew into this because numerical Python's package ecosystem is so horribly complex, and its users span such a huge set of install environments, that conda ended up having to be a generic rpm/brew/apt kind of thing.
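One concrete consequence of the userspace design: an activated conda environment announces itself to processes it launches. A small sketch of how a script might detect that it's running inside one, using the `CONDA_PREFIX` variable that `conda activate` exports and the `conda-meta/` directory that conda environments carry (the fallback heuristic here is my own, not an official API):

```python
import os
import sys
from pathlib import Path
from typing import Optional

def conda_prefix() -> Optional[str]:
    """Return the active conda environment's prefix, if any.

    First checks the CONDA_PREFIX variable that `conda activate`
    exports, then falls back to looking for the conda-meta/ directory
    that every conda environment keeps alongside its packages.
    """
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        return prefix
    if (Path(sys.prefix) / "conda-meta").is_dir():
        return sys.prefix
    return None

print(conda_prefix())  # env path inside a conda env; None outside one
```

Nothing here needs root or touches system directories, which is the point: the whole installation lives under one relocatable prefix in userland.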
TLDR - we are working to clean up this language to make it clear that educational institutions are exempt, and that these commercial terms do not apply to third-party channels hosted at anaconda.org (which includes conda-forge).
You’ve had the existing language up for years. Your licensing regime is a dark pattern torpedoing anything good about Anaconda. Anaconda is threatening academic researchers over back usage of something those researchers thought was free.
Show us don’t tell us. I’d love to not have to continue the rip and replace job ahead.
Edit: this LinkedIn response sums it up well. I'm not certain what guarantee you will make to research institutions' leaders that is going to lift those blocks.
However, please tell this to your sales/legal teams: the nature of their approaches has been somewhat poisoning the well. If leadership’s first encounter with a piece of software is: "you are in legal trouble", the reaction you're going to get is: "remove and block that legally dangerous piece of software at once", rather than "oh yes we should license it". We do buy licenses for software, we view it as giving back, but how the offering is first presented matters.
Fun joke but Anaconda has a track record of creating OSS and then turning it over to community governance. This includes the conda tool itself, libraries like bokeh, dask, numba, jupyterlab, and many more. And while PyScript project governance isn't in NumFOCUS, all of the code is permissively-licensed BSD/MIT.
The commercial licenses for the products and the commercial repository are what support all of this OSS development work.
Anaconda Code is what you're looking for! It's Python (via PyScript) running as an Excel plug-in that has full access to the spreadsheet and can harness a big part of the core PyData stack (including matplotlib, sklearn, pandas, etc.)
I feel you on this. Having done a lot of Python consulting for engineers and scientists, it is absolutely the case that most non-programmers have -zero- model of what I/O latency and bandwidth limitations look like. They are looking at programming APIs, and their mental models can include concepts like files and even byte layouts within files. But they generally have no working model of how a physical computer actually implements those things.
I've definitely seen file I/O in the middle of FORTRAN loops. Entirely correct from a functional perspective, and total disaster from an actual runtime perspective.
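The same anti-pattern translates directly to Python (this is a toy illustration, not the FORTRAN in question): functionally identical results, wildly different numbers of open/close syscalls.

```python
import os
import tempfile

def write_per_iteration(path, values):
    """Anti-pattern: reopen and close the file for every single value,
    paying the open/flush/close cost on each loop iteration."""
    for v in values:
        with open(path, "a") as f:
            f.write(v)

def write_batched(path, values):
    """Open once; the runtime's write buffer absorbs the small writes."""
    with open(path, "w") as f:
        f.writelines(values)

with tempfile.TemporaryDirectory() as d:
    slow = os.path.join(d, "slow.txt")
    fast = os.path.join(d, "fast.txt")
    values = [f"{i}\n" for i in range(10_000)]
    write_per_iteration(slow, values)
    write_batched(fast, values)
    # Both produce byte-identical output...
    assert open(slow).read() == open(fast).read()
    # ...but the first does 10,000 opens where the second does one.
```

Both versions are "correct" in exactly the sense described above: same file at the end, and nothing in the API hints at the cost difference.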
> But they generally have no working model of how a physical computer actually implements those things.
To be fair, how many blog posts have been written over the years on programmers doing ridiculous things with SQL, or network IO, or file IO, etc. etc.? If trained programmers struggle with these things, I'm willing to give non-programmers some slack.
My understanding of the problem is that Pyenv attempts to detect the contents of "/bin" relative to the top level of every Python installation that it manages.
It does this so that it can set up its shim to handle any executable that gets installed in any Pyenv-managed environment.
This is how Pyenv creates the "foobar is not available in your current environment, but is available in x.y.z" message. It's also a much more reliable solution than trying to explicitly whitelist every possible script that might get installed.
The problem is that this was only designed to work for Python executables and scripts installed by Pip. Conda environments can contain a lot more than that; it's not hard to end up with an entire C compiler toolchain in there (possibly even both GCC and LLVM) or even Coreutils.
If Pyenv detects `bin/gcc` in a Conda env, it will set up a system-wide shim for GCC, which no longer passes the `gcc` command along to the OS, but intercepts it, only to inform you that no such command exists in the current env!
So it's not that Pyenv hoses Conda envs. It's that Pyenv can hose PATH if you have it manage a Conda installation, and if that Conda installation ends up with non-Python stuff in `/bin`.
Obviously I don't know what exactly was broken when you tried to set up that application. But this particular adverse interaction bit me at work a few years ago, and ever since then I have insisted that Pyenv should never manage a Conda installation.
I think that's a reasonable policy anyway, in light of the facts that:
1) Conda isn't really a "Python distribution" anymore.
2) The Pyenv installer just runs the opaque Conda installer script and there's basically no way to control the version that gets installed.
3) They are different tools that serve different purposes and it doesn't make sense to have one manage the other anyway.
4) You probably shouldn't use the Python that's installed in the base Conda environment anyway. You need that to run Conda itself, and you want to keep its list of requirements small to make sure that updates can proceed cleanly. It's basically the same as any Linux package manager like APT, except of course that those tools don't generally support "environments" other than chroot.
This is one of the reasons people use Anaconda/miniconda for non-data science work: conda environments are self-contained Python installs, so if you conda/pip install packages into those environments, they will not break each other. This design requirement arose from the specific needs of numerical computing (which always drags in a ton of system-level C/C++/FORTRAN dependencies), but is a generically useful design construct.
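The isolation falls out of the directory layout: each environment is its own prefix with its own `site-packages`, so installs into one can't clobber another. A minimal sketch (POSIX layout; the env paths are hypothetical):

```python
import sys
from pathlib import Path

def env_site_packages(prefix):
    """Where third-party packages land for a given environment prefix.
    (POSIX layout shown; Windows uses <prefix>/Lib/site-packages.)"""
    v = f"{sys.version_info.major}.{sys.version_info.minor}"
    return Path(prefix) / "lib" / f"python{v}" / "site-packages"

# Two hypothetical conda envs: installs into one never touch the other.
a = env_site_packages("/opt/conda/envs/proj-a")
b = env_site_packages("/opt/conda/envs/proj-b")
print(a)
print(b)
assert a != b  # disjoint package trees, so no cross-env breakage
```

Whether you `conda install` or `pip install` inside an activated env, packages resolve against that env's own tree, which is why the environments don't step on each other.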
Anaconda is a distro, and conda is a package manager that works across OS platforms and hardware architectures, and installs cleanly into userland without requiring admin privileges. The only way we achieve this difficult goal is by creating a distro and build system that creates "portable" packages that can be relocated/relinked at install-time.
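The core of install-time relocation can be sketched in a few lines: packages are built against a long placeholder prefix recorded in the package metadata, and the installer rewrites that placeholder to wherever the user actually unpacked the environment. The placeholder string and file contents below are made up for illustration; the real mechanism also handles binaries (where the replacement is NUL-padded so string lengths are preserved).

```python
# Hypothetical build-time placeholder; conda records the real one,
# per file, in the package's metadata.
PLACEHOLDER = "/opt/placeholder_prefix"

def relocate_text(content: str, install_prefix: str) -> str:
    """Rewrite the build-time placeholder prefix to the directory the
    user actually installed into. This is what makes a package built
    on a build farm work from any userland location."""
    return content.replace(PLACEHOLDER, install_prefix)

# e.g. a pkg-config file baked with the build prefix:
pkgconfig = f"prefix={PLACEHOLDER}\nlibdir={PLACEHOLDER}/lib\n"
print(relocate_text(pkgconfig, "/home/alice/conda/envs/sci"))
```

Because the rewrite happens entirely inside the target prefix, no admin privileges are ever needed, which is exactly the userland property described above.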
Ultimately, Python's challenges in this department come from the fact that it has such great integration with low-level C/C++ libraries. This gives it super powers as duct tape/glue language, but it also drags it down into the packaging tech debt of C/C++. Hmm... maybe I should write that blog post: "Python Packaging Isn't The Problem; C/C++ Is." :-)
FWIW, we are soon going to be releasing a much faster dependency resolver. We are also thinking hard about how best to address the "growing ecosystem" problem, in a future-proof way.