avatarLiu Zuo Lin

Summary

The article discusses a Python string compression challenge that the author found particularly difficult as a beginner, detailing the logic and code behind the solution, and providing examples and an uncompression function.

Abstract

The author reflects on a challenging Python problem encountered early in their learning journey, which involved compressing a string by representing consecutive repeated characters with the character followed by the count of repetitions. The article explains the thought process and logic for solving this problem, including maintaining a current character count and appending the character and count to an output string when a new character is encountered. The author provides a step-by-step explanation of how the compression function processes the input string "aaaabbbcca" and shares the Python code for both the compression and decompression functions. The article concludes with the author's insights on the learning experience and an invitation for readers to support their work by subscribing to Medium or buying them a coffee.

Opinions

  • The author acknowledges that the compression algorithm may not be the most efficient, as it can increase the length of some strings, but emphasizes its value as a learning exercise.
  • The author admits to spending days on this problem due to unfamiliarity with the concept of counters at the time, suggesting that mastering basic programming concepts is crucial for problem-solving.
  • The article implies that the process of struggling with and eventually solving complex problems is a significant part of learning to program.
  • By sharing their personal experience, the author encourages beginner programmers to persevere through challenging problems to enhance their understanding and programming skills.
  • The author believes that their article can be beneficial for new Python learners to strengthen their basic concepts and improve their problem-solving abilities.
  • The author provides a call to action for readers to support their writing by becoming Medium members or contributing financially, indicating a desire for reader engagement and support for their content creation efforts.

This Simple Python Question Took Me Ridiculously Long To Solve As A Beginner

When I was still a Python beginner back in 2017, I did occasionally meet with questions that took me a couple of days or even weeks to solve, but it was attempting to solve these questions that made me learn the most. Here is probably the first ever question that took me ridiculously long to solve when I was still a Python beginner.

The Question — Compressing A String

Consider a string "aaaabbbcca". One simple way that we can compress this string would be to rewrite it in terms of 1) each letter and 2) how many times it appears in a certain group of letters.

For instance, in "aaaabbbcca", we can visualize 4 groups of letters — 1) "aaaa" 2) "bbb" 3) "cc" and 4) "a". We can then rewrite this string as "a4b3c2a1", as "a" appears 4 times, "b" appears 3 times, "c" appears 2 times, and the second "a" appears only once. Note that the 1st and 4th groups are separate as the "a" characters don’t occur one after another.

Our task would be to write a function compress that takes in 1 string and returns the compressed version of the input string.

NOTE: assume that no letter appears more than 9 times in a row

More Examples

aaaabbbcca --> a4b3c2a1
abca --> a1b1c1a1
aaaabbb11112222 --> a4b31424
apple -> a1p2l1e1

Note: Yes, I know this is not the best compression algorithm as some words actually get longer, but let’s just view this as a way to practice our Python

The Logic Behind The Solution

  1. Iterate through every single character in our input string
  2. Keep track of 3 variables — out (our output), current (the current letter) & count (number of times current appears)
  3. If we meet a letter that is the same as current, we increment count by 1
  4. Else, we add both current and count to our output out.
  5. At the end of the for loop, add current and count to out 1 last time

The Code Behind The Solution

Running Through The Code

Let’s take a look at what happens at every iteration when we run this function with the string "aaaabbbcca". At this point, out="", current="" & count=0

1st Iteration, ch = ”a”

  • out="", current="", count=0
  • ch="a"ch is not equals to current.
  • out becomes "0" (we will remove this later)
  • current becomes "a" and count becomes 1

2nd Iteration, ch = “a”

  • out="0", current="a", count=1
  • ch="a"ch is equal to current
  • current remains as "a", and count increments to 2

3rd Iteration, ch = “a”

  • out="0", current="a", count=2
  • ch="a"ch is equal to current
  • count increments to 3

4th Iteration, ch = “a”

  • out="0", current="a", count=3
  • ch="a"ch is equal to current
  • count increments to 4

5th Iteration, ch = “b”

  • out="0", current="a", count=4
  • ch="b"ch is not equal to current
  • out += current + str(count)out becomes "0a4"
  • current="b" & count=1

6th Iteration, ch = “b”

  • out="0a4", current="b", count=1
  • ch="b"ch is equal to current
  • count increments to 2

7th Iteration, ch = “b”

  • out="0a4", current="b", count=2
  • ch="b"ch is equal to current
  • count increments to 3

8th Iteration, ch = “c”

  • out="0a4", current="b", count=3
  • ch="c"ch is not equal to current
  • out += current + str(count)out becomes "0a4b3"
  • current="c", count=1

9th Iteration, ch = “c”

  • out="0a4b3", current="c", count=1
  • ch="c"ch is equal to current
  • count increments to 2

10th Iteration, ch = “a”

  • out="0a4b3", current="c", count=1
  • ch="a"ch is not equal to current
  • out += current + str(count)out becomes "0a4b3c2"
  • current="a", count=1

End of iteration

  • out += current + str(count)out becomes "0a4b3c2a1"
  • return out[1:]"a4b3c2a1" is returned

Code To Uncompress Our Compressed Strings

def uncompress(string):
    out = ""
    while len(string) > 0:
        letter, count, *string = string
        out += letter * int(count)
    return out

Some Final Words

While this seemingly extremely simple Python question can probably be solved by many of us relatively quickly, I actually spent days trying to figure this question out back in 2017 — I probably wasn’t that familiar with the concept of creating a counter at that time or something. Hopefully this article would be useful for newer Python learners in strengthening their basic concepts and ability to think like a programmer.

Conclusion

If this article provided value and you wish to support me as a writer, do consider signing up for a Medium membership — It’s $5 a month, and you get unlimited access to stories on Medium. If you sign up using my link below, I’ll earn a tiny commission at zero additional cost to you.

Sign up using my link here to read unlimited Medium articles!

If this article provided immense value for you, do consider buying me a coffee — every small contribution is appreciated greatly!

https://www.buymeacoffee.com/zlliu

If you wish to get notified whenever I publish, do consider joining my email list.

Python
Recommended from ReadMedium