Tkinter – Use characters/bytes offset as index for text widget

Question:

I want to delete part of a text widget’s content using only a character offset (or a byte offset, if possible).

I know how to do it for lines, words, etc., and I have looked through a lot of documentation.

Here is a minimal reproducible example:

import tkinter as tk

root = tk.Tk()

text = tk.Text(root)

txt = """Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Suspendisse enim lorem, aliquam quis quam sit amet, pharetra porta lectus.
Nam commodo imperdiet sapien, in maximus nibh vestibulum nec.
Quisque rutrum massa eget viverra viverra. Vivamus hendrerit ultricies nibh, ac tincidunt nibh eleifend a. Nulla in dolor consequat, fermentum quam quis, euismod dui.
Nam at gravida nisi. Cras ut varius odio, viverra molestie arcu.

Pellentesque scelerisque eros sit amet sollicitudin venenatis.
Proin fermentum vestibulum risus, quis suscipit velit rutrum id.
Phasellus nisl justo, bibendum non dictum vel, fermentum quis ipsum.
Nunc rutrum nulla quam, ac pretium felis dictum in. Sed ut vestibulum risus, suscipit tempus enim.
Nunc a imperdiet augue.
Nullam iaculis consectetur sodales.
Praesent neque turpis, accumsan ultricies diam in, fermentum semper nibh.
Nullam eget aliquet urna, at interdum odio. Nulla in mi elementum, finibus risus aliquam, sodales ante.
Aenean ut tristique urna, sit amet condimentum quam. Mauris ac mollis nisi.
Proin rhoncus, ex venenatis varius sollicitudin, urna nibh fringilla sapien, eu porttitor felis urna eu mi.
Aliquam aliquam metus non lobortis consequat.
Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Aenean id orci dui."""

text.insert(tk.INSERT, txt)


def test_delete(event=None):
    text.delete() # change this line here

text.pack(fill="both", expand=1)
text.pack_propagate(0)
text.bind('<Control-e>', test_delete)
root.mainloop()

It displays an example text, stored in a variable, inside a text widget. I use a single key binding to test some of the possible ways of doing what I want on that text.

I tried a lot of things, both from the documentation(s) and my own desperation:

  • text.delete(0.X): where X is any number. I thought that since lines were 1.0, maybe using 0.X would work on chars only. It only works on a single char, regardless of what X is (even with a big number).
  • text.delete(1.1, 1.3): This acts on a single line, because I wanted to see whether it would delete 3 chars in either direction on that line. It deletes 2 chars instead of 3: it skips one char at the start of the first line and deletes the 2 chars after that.
  • text.delete("end - 9c"): only works at the end (last line); it skips 7 chars counting back from EOF and then deletes a single char after that.
  • text.delete(0.1, 0.2): Does nothing. Same result for other 0.X, 0.X combinations.

Example of what I try to achieve:

Using the example text above would take too long, so let’s consider a smaller string, say "hello world".
Now let’s say we use an index that starts at 1 (it doesn’t matter, but it makes things easier to explain): the first char is "h" and the last one is "d". If I use a char range such as "2-7", that would be "ello w". If I want "1-8", that is "hello wo", and starting from the end, "11-2" is "ello world".

This is basically similar to what f.tell() and f.seek() do. I want to do something like that using only the content of the text widget, and then act on those byte/char ranges (in the example above, I’m deleting them).

Asked By: Nordine Lotfi


Answers:

I emulated f.seek and combined it with text.delete. It seems that what you were basically missing is that you need to take the insertion cursor into account. See the comments in the code:

def seek_delete(offset, whence):
    if whence == 0: #from the beginning
        start = '1.0'
        end = f'{start} +{offset} chars'
    elif whence == 1:# from insertion cursor
        current = 'insert'
        if offset >= 0:#positive offset
            start = current
            end = f'{start} +{offset} chars'
        else:#negative offset
            start = f'{current} {offset} chars'
            end = current
    elif whence == 2:#from the end
        start = f'end {offset} chars'
        end = 'end'
    text.delete(start, end)

I have tested it with different values with this binding:

text.bind('<Control-e>', lambda e:seek_delete(-2,1))

As a bonus, you can emulate f.tell quite easily like this:

def tell(event):
    print(text.index('insert'))
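
Note that text.index('insert') returns a "line.column" string rather than a single flat offset. A minimal sketch of a closer f.tell() analogue, assuming the widget holds only plain text: the helper names tell_chars and tell_bytes are hypothetical, and UTF-8 is an assumed encoding for the byte count.

```python
def tell_chars(text):
    """Return how many characters lie between the start of the widget and the cursor."""
    # get() returns the text between the two indexes as a Python str,
    # so its length is the character offset of the insertion cursor.
    return len(text.get("1.0", "insert"))


def tell_bytes(text, encoding="utf-8"):
    """Return the same position as a byte offset in the given (assumed) encoding."""
    return len(text.get("1.0", "insert").encode(encoding))
```

Both helpers only call the widget's get method, so they could be bound to a key in the same way as tell above, for example with lambda e: print(tell_chars(text)).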
Answered By: Thingamabobs

TL;DR

You can use a relative index similar to f.tell() by giving a starting index and then adding or subtracting lines or characters. For example, text.delete("1.0", "1.0+11c") ("1.0" plus 11 characters).

The canonical documentation for text widget indexes is in the tcl/tk man pages in a section named Indices.


text.delete(0.X): where X is any number. I thought that since lines were 1.0, maybe using 0.X would work on chars only. It only works on a single char, regardless of what X is (even with a big number).

I don’t know what you mean by "since lines were 1.0". The first part of the index is the line number, the second is the character number. Lines start counting at 1, characters at zero. So, the first character of the widget is "1.0". The first character of line 2 is "2.0", etc.

But yes, text.delete with a single index will only delete one character. That is the defined behavior.

text.delete(1.1, 1.3): This acts on a single line, because I wanted to see whether it would delete 3 chars in either direction on that line.

The delete method is documented to delete from the first index to the character before the last index:

"Delete a range of characters from the text. If both index1 and index2 are specified, then delete all the characters starting with the one given by index1 and stopping just before index2"

text.delete("end - 9c"): only works at the end (last line); it skips 7 chars counting back from EOF and then deletes a single char after that.

Yes. Again, a single index given to delete will delete just a single character.

text.delete(0.1, 0.2): Does nothing. Same result for other 0.X, 0.X combinations.

0.1 is an invalid index. An index is a string, not a floating point number, and the line number should be 1 or greater. Tkinter has to convert the line number to a whole number greater than or equal to 1, so 0.1 and 0.2 are both converted to mean "1.0". As I said earlier, the delete method stops just before the second index, so you’re deleting everything from "1.0" up to, but not including, "1.0", which is nothing.
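
A short illustration of why string indexes are safer than floats: the float 1.10 is the same number as 1.1, so the trailing zero of the column cannot survive, and line 1, column 10 silently becomes line 1, column 1. The helper make_index is a hypothetical name used only for this sketch.

```python
def make_index(line, column):
    """Build a Tk text index string from an integer line and column."""
    return f"{line}.{column}"


print(str(1.10))          # 1.1  (column 10 has become column 1)
print(make_index(1, 10))  # 1.10 (the intended index)
```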

Using the example text above would take too long, so let’s consider a smaller string, say "hello world". Now let’s say we use an index that starts at 1 (it doesn’t matter, but it makes things easier to explain): the first char is "h" and the last one is "d". If I use a char range such as "2-7", that would be "ello w". If I want "1-8", that is "hello wo", and starting from the end, "11-2" is "ello world".

If "hello world" starts at character position "1.0", and you want to use a relative index to delete a range of characters, you can delete it with something like text.delete("1.0", "1.0+11c") ("1.0" plus 11 characters).
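
The 1-based, inclusive ranges from the question ("2-7", "11-2") can be mapped onto such relative indexes. This is a minimal sketch, not a definitive implementation: offsets_to_indices and delete_range are hypothetical names, and it assumes the widget contains only plain text, so that every character, including each newline, counts as one position.

```python
def offsets_to_indices(first, last):
    """Convert a 1-based, inclusive character range into a pair of Tk text indexes."""
    if first < 1 or last < 1:
        raise ValueError("offsets are 1-based")
    lo, hi = sorted((first, last))  # "11-2" is treated the same as "2-11"
    # delete() stops just before its second index, so the inclusive,
    # 1-based end `hi` is already the correct exclusive, 0-based end.
    return f"1.0 + {lo - 1} chars", f"1.0 + {hi} chars"


def delete_range(text, first, last):
    """Delete characters first..last (1-based, inclusive) from a Text widget."""
    text.delete(*offsets_to_indices(first, last))
```

With "hello world" in the widget, delete_range(text, 2, 7) would target the same characters as the Python slice "hello world"[1:7], that is "ello w".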

Answered By: Bryan Oakley

Based on my own relentless testing and other answers here, I managed to get to a solution.

import tkinter as tk
from tkinter import messagebox  # https://stackoverflow.com/a/29780454/12349101

root = tk.Tk()

main_text = tk.Text(root)

box_text = tk.Text(root, height=1, width=10)
box_text.pack()

txt = """hello world"""

# Total length of the text content. Can be replaced by `os.path.getsize` or other alternatives for files.
len_txt = len(txt)

main_text.insert(tk.INSERT, txt)


def offset():
    # Get the input of the text widget without the trailing newline (Tkinter adds it by default).
    inputValue = box_text.get("1.0", "end-1c")

    # focusing the other text widget, deleting and re-insert the original text so that the selection/tag is updated (no need to move the mouse to the other widget in this example)
    main_text.focus()
    main_text.delete("1.0", tk.END)
    main_text.insert(tk.INSERT, txt)


    to_do = inputValue.split("-")

    if len(to_do) == 1:  # if length is 1, it probably is a single offset for a single byte/char
        to_do.append(to_do[0])

    if not to_do[0].isdigit() or not to_do[1].isdigit():  # Only integers are supported
        messagebox.showerror("error", "Only integers are supported")
        return  # trick to prevent the failing range to be executed

    if int(to_do[0]) > len_txt or int(to_do[1]) > len_txt:  # total length is the maximum range
        messagebox.showerror("error",
                             "One of the integers in the range seems to be bigger than the total length")
        return  # trick to prevent the failing range to be executed

    if to_do[0] == "0" or to_do[1] == "0":  # since we don't use a 0 index, this isn't needed
        messagebox.showerror("error", "Using zero in this range isn't useful")
        return  # trick to prevent the failing range to be executed

    if int(to_do[0]) > int(to_do[1]):  # This is to support reverse range offset, so 11-2 -> 2-11, etc
        first = int(to_do[1]) - 1
        first = str(first).split("-")[-1:][0]

        second = (int(to_do[0]) - len_txt) - 1
        second = str(second).split("-")[-1:][0]
    else:  # use the offset range normally
        first = int(to_do[0]) - 1
        first = str(first).split("-")[-1:][0]

        second = (int(to_do[1]) - len_txt) - 1
        second = str(second).split("-")[-1:][0]

    print(first, second)
    main_text.tag_add("sel", '1.0 + {}c'.format(first), 'end - {}c'.format(second))


buttonCommit = tk.Button(root, text="use offset",
                         command=offset)
buttonCommit.pack()
main_text.pack(fill="both", expand=1)
main_text.pack_propagate(0)
root.mainloop()

The above works as described in the "hello world" example in my question. It isn’t a 1:1 clone/emulation of f.tell() or f.seek(), but I feel like it’s close.

The above does not use text.delete but selects the text instead, so it’s visually less confusing (at least to me).

It works with the following offset types:

  • reverse range: 11-2 -> 2-11 so the order does not matter
  • normal range: 2-11, 1-8, 8-10
  • single offset: 10 or 10-10 so it can support single char/byte

The main thing I noticed is that '1.0 + {}c', 'end - {}c', where each {} is one end of the range, works by omitting the given number of characters from each side.

If you were to use 1-3 as a range on the string hello world, it would select ello wor. You could say it omitted h and ld\n, where \n is the newline added by Tkinter (which we ignore in the code above unless it’s part of the total length variable). The correct offset (or at least the one following the example I gave in the question above) would be 2-9.
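
The omission behaviour can be modelled with plain Python slicing. This is only a sketch of the index arithmetic, not Tk itself: model_select is a hypothetical helper, and the model assumes that "end" sits just after the trailing newline that the text widget always keeps.

```python
content = "hello world\n"  # widget content, including Tk's trailing newline


def model_select(content, skip_start, skip_end):
    """Model the range '1.0 + {skip_start}c' .. 'end - {skip_end}c' as a slice."""
    return content[skip_start:len(content) - skip_end]


print(repr(model_select(content, 1, 3)))  # 'ello wor'
print(repr(model_select(content, 1, 1)))  # 'ello world'
```

The second call corresponds to the values the code above computes for the range 2-11: one character skipped at the start, and only the trailing newline skipped at the end.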

P.S.: In this example, you need to click the button after entering the offset range.

Answered By: Nordine Lotfi